Pub Date : 2026-01-13 DOI: 10.1109/LSP.2026.3653694
Tanveer Alam Khan;Somanath Pradhan
The effectiveness of conventional active noise control (ANC) systems deteriorates significantly in impulsive noise environments. Over the past few years, the hyperbolic family of adaptive filtering algorithms has been extensively applied to suppress impulsive noise. This work introduces a new exponential hyperbolic secant adaptive filter for active control, which is well suited to impulsive noise scenarios. Additionally, the stability condition on the learning rate, the steady-state behavior, and the computational complexity are studied. Simulation outcomes based on measured acoustic paths demonstrate the efficiency of the proposed algorithm under strong and dynamic impulsive environments.
Robust Exponential Hyperbolic Secant Algorithm for Active Control Against Impulsive Noise Environments. IEEE Signal Processing Letters, vol. 33, pp. 663-667.
Pub Date : 2026-01-12 DOI: 10.1109/LSP.2026.3652378
Rui Yan;Yao Zhao;Shaowei Weng;Lifang Yu
Recently, researchers have shifted focus to reversible data hiding (RDH) schemes for JPEG images. Reinforcement learning (RL) allows RDH to automatically acquire the optimal two-dimensional (2D) mapping for 2D histograms of non-zero quantized alternating current coefficients. However, using only the payload-distortion reward mechanism (PDRM) in RL cannot inject payload guidance into the 2D mapping generation process. To tackle this issue, we propose a payload supplementary reward mechanism (PSRM) and incorporate PDRM and PSRM into RL to construct DR-2DNet, a dual-reward guided 2D mapping generation network that takes additional payload guidance into account. DR-2DNet generates two candidate 2D mappings: one with low distortion, generated using PDRM alone, and the other with low distortion and high payload, obtained by jointly using PDRM and PSRM. Finally, for the required payload, whichever of the two mappings yields the lower distortion is used for data embedding. To prioritize low-cost frequency bands for data embedding, a frequency selection strategy combining the smoothness and embedding performance of each frequency band is designed to evaluate its cost, reducing image distortion and preserving the file size. Extensive experiments are conducted on the Kodak dataset and 100 images randomly chosen from the BOSSBase dataset, and the results demonstrate that the proposed method is superior to several related state-of-the-art RDH schemes for JPEG images.
A Dual-Reward Guided 2D Mapping Generation Network for JPEG Reversible Data Hiding. IEEE Signal Processing Letters, vol. 33, pp. 526-530.
Pub Date : 2026-01-12 DOI: 10.1109/LSP.2026.3652124
Qian-Peng Xie;Xiao-Peng Li;Ji-Yuan Chen;Ming-Xing Fang
This letter investigates the problem of direction-of-departure (DOD) and direction-of-arrival (DOA) estimation for non-line-of-sight (NLOS) targets in bistatic multiple-input multiple-output (MIMO) radar systems assisted by an intelligent reflecting surface (IRS). To tackle this issue, we propose a covariance tensor subspace-based algorithm. First, the received data is modeled within a tensor framework to preserve its inherent multi-dimensional spatiotemporal structure. Then, a fourth-order covariance tensor is constructed by computing correlations along the temporal dimension. Using the higher-order singular value decomposition (HOSVD), the signal subspace matrix is derived from this covariance tensor. The receive steering matrix is accurately reconstructed by exploiting the property of the Khatri–Rao product for full-column-rank matrices. Based on the estimated signal subspace and the reconstructed steering matrix, DOD and DOA estimation is efficiently performed via the rotational invariance technique combined with a one-dimensional correlation-based method, which provides automatic parameter pairing. Simulation results validate the superiority and effectiveness of the proposed algorithm in estimating angles.
Covariance Tensor Decomposition for NLOS Direction Finding in RIS-Aided Bistatic MIMO Radar. IEEE Signal Processing Letters, vol. 33, pp. 574-578.
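The abstract above relies on a property of the Khatri–Rao product. As a reminder of what that product is, here is a minimal pure-Python sketch of the column-wise Kronecker product. It is not the authors' algorithm; the function name `khatri_rao` and the small example matrices are illustrative assumptions.

```python
def khatri_rao(a, b):
    """Column-wise Kronecker product of a (I x K) and b (J x K), giving (I*J x K)."""
    if len(a[0]) != len(b[0]):
        raise ValueError("both matrices need the same number of columns")
    n_cols = len(a[0])
    # Row i*J + j of the result holds a[i][k] * b[j][k] for every column k.
    return [
        [a[i][k] * b[j][k] for k in range(n_cols)]
        for i in range(len(a))
        for j in range(len(b))
    ]


# Hypothetical 2 x 2 and 3 x 2 matrices, chosen only for illustration.
A = [[1, 2], [3, 4]]
B = [[1, 0], [0, 1], [2, 2]]
C = khatri_rao(A, B)  # 6 x 2 result
```

Each column of `C` is the Kronecker product of the matching columns of `A` and `B`, which is the structure that steering matrices of bistatic MIMO arrays typically share.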
Pub Date : 2026-01-12 DOI: 10.1109/LSP.2026.3652975
Yunjia Zhang;Zhixiong Hu;Mei Li
Accurate blood pressure (BP) monitoring using photoplethysmography (PPG) remains challenging due to noise, individual variability, and nonlinear signal dynamics. In this letter, we present FBPP-Net, a lightweight temporal-mixing framework that integrates sparse and shared Mixture-of-Experts (MoE) modules with quantile regression. The model enables specialized subnetworks to capture diverse temporal dependencies while providing robust probabilistic estimation of systolic and diastolic BP. Without requiring GPU acceleration, FBPP-Net achieves efficient training and inference, making it suitable for real-time wearable applications. Experiments on the UQVS dataset show that FBPP-Net-MoE attains SBP/DBP errors of 2.83/3.54 mmHg, and FBPP-Net-TM achieves 3.08/3.18 mmHg, outperforming XGBoost, LSTM, MLP, Informer, and TSMixer baselines. Furthermore, the analysis of expert activations and temporal segments provides interpretable insights into the localized dynamics driving near-future BP variations, supporting intelligent and explainable physiological monitoring.
Modeling Localized PPG for Blood Pressure Forecasting With MoE and Quantile Regression. IEEE Signal Processing Letters, vol. 33, pp. 531-535.
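Quantile regression, which the abstract above uses for probabilistic BP estimation, is usually trained with the pinball loss. The sketch below shows that standard loss in plain Python. The function name `pinball_loss` and the sample values are assumptions made for illustration; they do not come from the paper or the UQVS dataset.

```python
def pinball_loss(y_true, y_pred, q):
    """Mean pinball (quantile) loss for a quantile level q in (0, 1)."""
    if not 0.0 < q < 1.0:
        raise ValueError("q must lie strictly between 0 and 1")
    total = 0.0
    for y, p in zip(y_true, y_pred):
        diff = y - p
        # Under-prediction is weighted by q, over-prediction by (1 - q).
        total += q * diff if diff >= 0 else (q - 1.0) * diff
    return total / len(y_true)


# Made-up systolic readings in mmHg, for illustration only.
observed = [120.0, 118.0, 125.0]
predicted = [118.0, 121.0, 125.0]
loss_q90 = pinball_loss(observed, predicted, 0.9)
loss_q10 = pinball_loss(observed, predicted, 0.1)
```

A high quantile level such as 0.9 penalizes under-prediction more heavily, while a low level such as 0.1 penalizes over-prediction, so fitting several levels yields a prediction interval rather than a single point estimate.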
Pub Date : 2026-01-12 DOI: 10.1109/LSP.2026.3652120
Song-Kyoo Kim
Customized dynamic filter augmentation (CDFA) is a novel data augmentation technique for time-series forecasting that adapts convolutional principles from signal processing to emphasize historical patterns through localized correlations and amplitude adjustments. Built upon convolutional filters, local correlations between paired random variables, and statistical forecasting functions from compact data learning (CDL), CDFA generates plausible subsequences while preserving original data characteristics. Empirical evaluations on real-world datasets, including stock prices for Apple, Google, AMD, and oil, demonstrate superior root mean square error (RMSE) reductions, with CDFA achieving 81% to 82% improvements over baselines such as statistical forecasting from CDL and customized convolutional filters. This approach enhances model efficiency for large-scale sequences, outperforming traditional linear models in capturing shared patterns across diverse applications.
Customized Dynamic Filter Augmentation. IEEE Signal Processing Letters, vol. 33, pp. 639-642.
Pub Date : 2026-01-12 DOI: 10.1109/LSP.2026.3652953
Ping Cao;Chunjie Zhang;Xiaolong Zheng;Yao Zhao
Human-Object Interaction (HOI) detection requires not only recognizing what the interaction is but also understanding where it occurs. Although recent methods have achieved remarkable progress, they often lack effective joint modeling of spatial and semantic information, which is essential for accurate reasoning in complex scenes. In this paper, we propose a Semantic-Spatial Guided Reasoning (SSGR) framework that performs interaction reasoning by jointly modeling global semantic cues and fine-grained spatial priors. Specifically, SSGR constructs pair-specific spatial layouts to encode detailed spatial relationships and introduces a global semantic decoder to learn category-aware semantic representations. A semantic-spatial guided reasoning module further adaptively fuses these complementary cues, enabling unified reasoning and more discriminative interaction understanding. Extensive experiments on HICO-DET and V-COCO demonstrate that SSGR consistently outperforms prior methods under both standard and zero-shot settings, validating the effectiveness of our semantic-spatial reasoning paradigm.
Semantic-Spatial Guided Reasoning for Human-Object Interaction Detection. IEEE Signal Processing Letters, vol. 33, pp. 551-555.
Pub Date : 2026-01-12 DOI: 10.1109/LSP.2026.3653395
Luan Portella;R. J. Cintra;Aluísio Pinheiro
This letter introduces low-complexity approximations for the Wavelet Energy Correlation Screening (WECS) method, which aims at change detection in multitemporal SAR images. The WECS method relies on the non-decimated discrete wavelet transform (ND-DWT) to compute approximation coefficients employed in a feature screening process based on the Pearson correlation. Although effective, WECS presents a high computational cost due to its repeated wavelet filtering stage. To overcome this drawback, we propose two approximations for the wavelet filter coefficients, obtained by truncating their canonical signed digit (CSD) representation, which significantly reduces the number of arithmetic operations. Numerical experiments using both simulated and real-world datasets demonstrate that the proposed methods not only maintain the performance of the original WECS but also achieve computational gains, even outperforming it in certain scenarios.
Low-Complexity Approximations of the WECS Method for SAR Change Detection. IEEE Signal Processing Letters, vol. 33, pp. 643-647.
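The idea of truncating a canonical signed digit (CSD) representation, mentioned in the abstract above, can be illustrated on a single fixed-point coefficient. The sketch below is a generic illustration, not the paper's approximations: the integer 107, the helper names, and the rule of keeping the most significant nonzero digits are all assumptions.

```python
def to_csd(n):
    """CSD digits of integer n, least significant first, each in {-1, 0, 1}."""
    digits = []
    while n != 0:
        if n % 2:
            d = 2 - (n % 4)  # +1 when n % 4 == 1, -1 when n % 4 == 3
            n -= d
        else:
            d = 0
        digits.append(d)
        n //= 2
    return digits


def truncate_csd(digits, keep):
    """Keep only the `keep` most significant nonzero digits; zero the rest."""
    out = [0] * len(digits)
    kept = 0
    for i in range(len(digits) - 1, -1, -1):
        if digits[i] != 0 and kept < keep:
            out[i] = digits[i]
            kept += 1
    return out


def from_digits(digits):
    """Rebuild the integer value from signed digits (least significant first)."""
    return sum(d * (1 << i) for i, d in enumerate(digits))


coeff = 107                       # hypothetical fixed-point filter coefficient
csd = to_csd(coeff)               # 107 = 128 - 16 - 4 - 1
approx = from_digits(truncate_csd(csd, 2))  # keeps 128 - 16
```

Each nonzero CSD digit costs one addition or subtraction in a shift-and-add multiplier, so dropping the least significant nonzero digits trades a small coefficient error for fewer arithmetic operations.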
Pub Date : 2026-01-12 DOI: 10.1109/LSP.2026.3652955
Min Li;Zhaofei Hao;Gang Li;Jin Wan;Delong Han;Mingle Zhou
Small Object Detection (SOD) aims to accurately identify and locate small objects in images. However, existing methods usually focus on exploring spatial domain features, neglecting high-frequency features that preserve fine-grained details such as texture and edge information. To overcome this limitation, we propose a High-Frequency Feature-Oriented Network (HFFO-Net). First, we introduce the Channel-wise Frequency Modulation Module (CFMM), which leverages the 2D Discrete Cosine Transform (DCT) to accentuate salient frequency components while mitigating noise interference. Second, we design a High-Frequency Oriented Module (HFOM), which utilizes the Channel Selection Branch (CSB) and Spatial Selection Branch (SSB) to highlight small objects along the channel and spatial dimensions. Third, we introduce a Dual-Query Attention Fusion Mechanism (DQAFM), which reduces the semantic gap between spatial and frequency features and achieves better feature fusion through bidirectional cross-attention. Extensive experiments demonstrate that HFFO-Net excels at detecting small objects.
Boosting Small Object Detection via High-Frequency Feature Oriented Network. IEEE Signal Processing Letters, vol. 33, pp. 584-588.
Pub Date : 2026-01-12 DOI: 10.1109/LSP.2026.3652127
Kun Lu;Hongli Zhang;Yuchen Yang;Chao Meng;Binxing Fang
Social bots constitute a substantial fraction of active accounts on digital platforms, fundamentally threatening information authenticity and democratic discourse. Contemporary detection methods confront critical limitations: information imbalance across heterogeneous relations, computational challenges in processing massive neighborhoods, and inadequate multi-scale representation learning. We propose HIER (Heterogeneous Information Bottleneck and Expert Routing), a pioneering framework that integrates variational information theory with mixture-of-experts paradigms for social network analysis. HIER introduces relation-aware variational information bottleneck for optimal compression across relationship types, dynamic sparse expert routing that extends mixture-of-experts to edge-level graph processing, and dual-scale mutual information maximization enhancing representation discriminability through neighborhood consistency and graph-level contrastive learning. Experimental validation demonstrates HIER’s superior performance across real-world datasets, establishing new benchmarks for heterogeneous social bot detection.
HIER: Heterogeneous Information Bottleneck and Expert Routing for Social Bot Detection. IEEE Signal Processing Letters, vol. 33, pp. 521-525.
Pub Date : 2026-01-12 DOI: 10.1109/LSP.2026.3653403
Lu Li;Qinkun Xiao;Peiran Liu
Continuous sign language recognition (CSLR) requires fine-grained alignment between visual sequences and gloss annotations under weak supervision, which is challenged by modality heterogeneity and ambiguous frame-to-gloss correspondence. We propose a Multimodal Cosine Similarity Transformer (MMCST) to address these issues. MMCST integrates RGB and keypoint heatmap features via gated fusion, and aligns them with gloss embeddings through a Gloss-Conditioned Cosine-Normalized Attention (GCNA) mechanism that stabilizes cross-modal alignment. To further enhance semantic consistency, we introduce Gloss-aware Contrastive Regularization (GLCR). The fused representation is modeled by a cosine-similarity Transformer and decoded with CTC. Experimental results show that MMCST achieves consistent improvements over strong baselines, and ablation studies confirm the effectiveness of gated fusion, GCNA, and GLCR in improving semantic alignment and yielding smoother training dynamics.
Multimodal Cosine Similarity Transformer for Gloss-Guided Sign Language Recognition. IEEE Signal Processing Letters, vol. 33, pp. 673-677.
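The abstract above does not spell out how GCNA is computed, so the following is only a generic sketch of cosine-normalized attention, in which queries and keys are L2-normalized before the softmax. The function names, the `scale` temperature, and the toy vectors are assumptions, and the gloss-conditioning step is not modeled.

```python
import math


def l2_normalize(v, eps=1e-12):
    """Scale a vector to (approximately) unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / (norm + eps) for x in v]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]


def cosine_attention(queries, keys, values, scale=10.0):
    """Attention whose logits are scaled cosine similarities, bounded in [-scale, scale]."""
    qn = [l2_normalize(q) for q in queries]
    kn = [l2_normalize(k) for k in keys]
    dim = len(values[0])
    outputs = []
    for q in qn:
        logits = [scale * sum(a * b for a, b in zip(q, k)) for k in kn]
        weights = softmax(logits)
        outputs.append(
            [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]
        )
    return outputs


# Toy data: one query, two keys, two one-dimensional values.
keys = [[1.0, 0.0], [0.0, 1.0]]
values = [[1.0], [0.0]]
out_small = cosine_attention([[1.0, 0.2]], keys, values)
out_large = cosine_attention([[100.0, 20.0]], keys, values)
```

Because the logits depend only on the angle between query and key, rescaling a feature vector leaves the attention weights unchanged, which is one common reason such normalization is used to stabilize training across modalities with different feature magnitudes.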