
IEEE Signal Processing Letters: Latest Articles

Quadratic Equality Constrained Least Squares: Low-Complexity ADMM for Global Optimality
IF 3.9 | Zone 2, Engineering Technology | Q2 ENGINEERING, ELECTRICAL & ELECTRONIC | Pub Date: 2025-12-19 | DOI: 10.1109/LSP.2025.3646132
Tong Wei;Huiping Huang;Linlong Wu;Chong-Yung Chi;Bhavani Shankar M. R.;Björn Ottersten
This letter addresses the quadratic equality constrained least squares (QEC-LS) problem, a class of non-convex optimization problems that arise in various signal processing and communication applications. We revisit the alternating direction method of multipliers (ADMM) approach to the QEC-LS problem and investigate its convergence and efficiency. Despite the inherent non-convexity, the proposed ADMM algorithm is proved to converge globally, requiring only that the quadratic term equal a positive constant. Numerical results demonstrate that our method achieves global optimality with significantly reduced complexity compared to existing approaches such as semidefinite relaxation and primal-dual methods.
IEEE Signal Processing Letters, vol. 33, pp. 361-365. PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11304554
Citations: 0
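The QEC-LS abstract above describes an ADMM scheme without giving its update rules. As a rough illustration only, the following is a textbook scaled-form ADMM for the special case of minimizing ||Ax - b||^2 subject to x^T x = c, using the splitting x = z. The identity quadratic form, the penalty `rho`, the iteration count, and the function name `qec_ls_admm` are all assumptions of this sketch; it is not the letter's algorithm and carries none of its convergence guarantees.

```python
import numpy as np

def qec_ls_admm(A, b, c, rho=1.0, iters=500):
    """Generic ADMM sketch for: minimize ||A x - b||^2 subject to x^T x = c (c > 0)."""
    n = A.shape[1]
    # The x-update solves the same linear system every iteration, so build it once.
    H = 2.0 * A.T @ A + rho * np.eye(n)
    Atb2 = 2.0 * A.T @ b
    z = np.full(n, np.sqrt(c / n))    # feasible start: z^T z = c
    u = np.zeros(n)                   # scaled dual variable
    x = z.copy()
    for _ in range(iters):
        # x-update: unconstrained regularized least squares.
        x = np.linalg.solve(H, Atb2 + rho * (z - u))
        # z-update: project x + u onto the sphere of radius sqrt(c).
        v = x + u
        norm_v = np.linalg.norm(v)
        if norm_v > 0:
            z = np.sqrt(c) * v / norm_v
        # Dual update on the consensus residual x - z.
        u = u + x - z
    return x, z

rng = np.random.default_rng(0)
A = rng.standard_normal((20, 5))
b = rng.standard_normal(20)
x, z = qec_ls_admm(A, b, c=1.0)
print("constraint value z^T z:", z @ z)
print("objective at z:", np.sum((A @ z - b) ** 2))
```

The returned `z` satisfies the equality constraint by construction; whether `x` and `z` agree at termination depends on `rho` and the data in this generic version.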
Viewport-Patch Extraction Enhanced 360° Video Quality Assessment
IF 3.9 | Zone 2, Engineering Technology | Q2 ENGINEERING, ELECTRICAL & ELECTRONIC | Pub Date: 2025-12-19 | DOI: 10.1109/LSP.2025.3646129
Xiaoyu Yan;Chao Yang;Ping An;Xinpeng Huang
With the rising adoption of 360° video in virtual reality (VR) applications, assessing its perceptual quality remains a challenge due to projection-induced distortions in equirectangular projection (ERP) formats. Traditional sliding-window cropping methods often distort high-latitude content and fail to reflect the actual viewing experience. To address this, we propose a novel viewport patch-based video quality assessment (VQA) method. By sampling view directions on the sphere and applying gnomonic projection, our method extracts undistorted and perceptually consistent viewport patches that preserve both spatial fidelity and full-frame coverage. We further design a two-stream network that jointly models high-frequency distortion and residual information over time, enhanced by squeeze-and-excitation (SE) attention to capture spatial-temporal features. Experiments and analysis show that our method significantly improves the accuracy and reliability of 360° VQA, achieving PLCC/SROCC values of 0.9603/0.9628 on the VQA-ODV dataset and 0.9585/0.9400 on the BIT360 dataset, with only 0.22M parameters.
IEEE Signal Processing Letters, vol. 33, pp. 386-390.
Citations: 0
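The viewport-patch abstract above relies on gnomonic projection to cut undistorted patches out of an ERP frame. The sketch below shows one way this could look, using the standard inverse gnomonic formulas rewritten to avoid dividing by the radius. The field of view, patch size, nearest-neighbour sampling, and the function name `viewport_patch` are assumptions made for illustration; the letter's actual sampling strategy is not given in the abstract.

```python
import numpy as np

def viewport_patch(erp, lon0, lat0, fov_deg=60.0, size=33):
    """Sample a size x size viewport from an ERP frame by inverse gnomonic projection.

    erp  : H x W (or H x W x C) equirectangular frame
    lon0 : viewport-centre longitude in radians, in [-pi, pi)
    lat0 : viewport-centre latitude in radians, in (-pi/2, pi/2)
    """
    H, W = erp.shape[:2]
    half = np.tan(np.radians(fov_deg) / 2.0)     # half-width of the tangent plane
    t = np.linspace(-half, half, size)
    x, y = np.meshgrid(t, -t)                    # y points up while rows go down
    s = np.sqrt(1.0 + x**2 + y**2)
    # Inverse gnomonic projection: tangent-plane (x, y) -> (lon, lat) on the sphere.
    lat = np.arcsin(np.clip((np.sin(lat0) + y * np.cos(lat0)) / s, -1.0, 1.0))
    lon = lon0 + np.arctan2(x, np.cos(lat0) - y * np.sin(lat0))
    # Spherical angles -> ERP pixel indices (nearest neighbour, wrap in longitude).
    u = np.floor((lon / (2 * np.pi) + 0.5) * W).astype(int) % W
    v = np.clip(np.floor((0.5 - lat / np.pi) * H).astype(int), 0, H - 1)
    return erp[v, u]

erp = np.arange(64 * 128).reshape(64, 128)       # toy frame with unique pixel values
patch = viewport_patch(erp, lon0=0.0, lat0=0.0)
print(patch.shape)
```

Calling the function over a set of view directions spread across the sphere would give a collection of patches covering the whole frame; bilinear interpolation would be a natural replacement for the nearest-neighbour lookup.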
Underwater Acoustic Channel Estimation via Accelerated TMSBL With KSVD-Based Denoising and Robust Initialization
IF 3.9 | Zone 2, Engineering Technology | Q2 ENGINEERING, ELECTRICAL & ELECTRONIC | Pub Date: 2025-12-18 | DOI: 10.1109/LSP.2025.3645580
Chuanxi Xing;Yiwen Hou;Yihan Meng;Tinglong Huang;Weiqiang Li;Minglinhan Hu
To address the high complexity and noise sensitivity of the Temporal Multiple Sparse Bayesian Learning (TMSBL) algorithm in shallow-sea environments, this letter proposes a novel and robust channel estimation scheme. The scheme first denoises the received pilot matrix using the K-Singular Value Decomposition (KSVD) algorithm and then uses Stagewise Orthogonal Matching Pursuit (StOMP) to acquire a robust sparse prior for initializing the TMSBL framework. This structured approach exploits the temporal correlation between channels for joint estimation, while the noise variance is estimated directly from OFDM null subcarriers to enhance stability and efficiency. Simulation results demonstrate the superiority of the proposed method. At an SNR of −10 dB, it reduces the normalized mean square error (NMSE) by more than 94% compared to the standard TMSBL algorithm and reduces the computation time by approximately 95.87%, ensuring higher accuracy and efficiency for underwater acoustic communications.
IEEE Signal Processing Letters, vol. 33, pp. 431-435.
Citations: 0
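Two quantities from the underwater channel estimation abstract above are easy to make concrete: the noise variance read off OFDM null subcarriers, and the NMSE used to report accuracy. The sketch below assumes the simplest possible estimator (average power on the null tones) and a toy signal model with a flat unit channel; the null positions, BPSK symbols, and function names are hypothetical and not taken from the letter.

```python
import numpy as np

def noise_variance_from_nulls(Y, null_idx):
    """Average received power on null subcarriers; Y is (subcarriers x OFDM symbols)."""
    return np.mean(np.abs(Y[null_idx, :]) ** 2)

def nmse(h_est, h_true):
    """Normalized mean square error between an estimate and the true channel."""
    return np.sum(np.abs(h_est - h_true) ** 2) / np.sum(np.abs(h_true) ** 2)

rng = np.random.default_rng(1)
n_sc, n_sym, sigma2 = 256, 200, 0.5
null_idx = np.arange(0, n_sc, 8)                 # hypothetical null-subcarrier positions
X = rng.choice([-1.0, 1.0], size=(n_sc, n_sym)).astype(complex)
X[null_idx, :] = 0.0                             # nothing is transmitted on null tones
noise = np.sqrt(sigma2 / 2) * (rng.standard_normal((n_sc, n_sym))
                               + 1j * rng.standard_normal((n_sc, n_sym)))
Y = X + noise                                    # flat unit channel, for simplicity
sigma2_hat = noise_variance_from_nulls(Y, null_idx)
print("estimated noise variance:", sigma2_hat)
```

Because null tones carry no signal, their average power is an estimate of the noise variance that needs no knowledge of the channel, which is presumably why it is attractive as an input to a Bayesian estimator.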
DOA Estimation Exploiting Compressive Measurements With Mixed-ADCs
IF 3.9 | Zone 2, Engineering Technology | Q2 ENGINEERING, ELECTRICAL & ELECTRONIC | Pub Date: 2025-12-18 | DOI: 10.1109/LSP.2025.3645593
Muran Guo;Shenao Gu;Limin Guo;Guifu Yang
To meet the demand for direction-of-arrival (DOA) estimation systems with low cost and low power consumption, this letter proposes a new scheme that adopts spatial compressive sampling and mixed-resolution quantization (MRQ). Spatial compressive sampling reduces the number of front-end circuit chains, thereby lowering the system cost. Additionally, some channels are quantized with few bits, which reduces power consumption. However, spatial compressive sampling, along with MRQ, leads to information loss during the compression procedure. To assess the estimation performance of the proposed scheme, this letter derives the Cramér-Rao bound (CRB) expression to quantify the performance loss, where a compressive additive quantization noise model is constructed to characterize the effects of MRQ. Overall, the proposed scheme reduces system cost and power consumption at the expense of only marginal precision degradation. Numerical simulations are conducted to validate the performance of the proposed scheme.
IEEE Signal Processing Letters, vol. 33, pp. 436-440.
Citations: 0
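The front end described in the mixed-ADC abstract above combines two steps: a spatial compression that maps many antennas to fewer channels, and quantization in which only some channels use few bits. A minimal simulation of that signal chain might look as follows. The random Gaussian compression matrix `Phi`, the half-wavelength uniform linear array, the 2-bit mid-rise quantizer, and the choice to treat high-resolution channels as unquantized are all assumptions of this sketch, not details from the letter.

```python
import numpy as np

def uniform_quantize(x, bits, vmax):
    """Mid-rise uniform quantizer with 2**bits levels on [-vmax, vmax]."""
    levels = 2 ** bits
    step = 2.0 * vmax / levels
    idx = np.clip(np.floor(x / step), -(levels // 2), levels // 2 - 1)
    return (idx + 0.5) * step

def mixed_adc(y, low_bits, low_mask, vmax):
    """Quantize I and Q of the masked channels at low_bits; leave the others untouched."""
    out = y.copy()
    out[low_mask] = (uniform_quantize(y[low_mask].real, low_bits, vmax)
                     + 1j * uniform_quantize(y[low_mask].imag, low_bits, vmax))
    return out

rng = np.random.default_rng(3)
n_ant, n_ch, n_snap = 16, 8, 100
theta = np.radians(20.0)
a = np.exp(1j * np.pi * np.arange(n_ant) * np.sin(theta))    # ULA steering vector
s = (rng.standard_normal(n_snap) + 1j * rng.standard_normal(n_snap)) / np.sqrt(2)
noise = 0.1 * (rng.standard_normal((n_ant, n_snap))
               + 1j * rng.standard_normal((n_ant, n_snap)))
x = np.outer(a, s) + noise                                   # antenna-domain snapshots
Phi = (rng.standard_normal((n_ch, n_ant))
       + 1j * rng.standard_normal((n_ch, n_ant))) / np.sqrt(2 * n_ant)
y = Phi @ x                                                  # spatial compressive sampling
low_mask = np.arange(n_ch) < 4                               # first 4 channels use 2-bit ADCs
yq = mixed_adc(y, low_bits=2, low_mask=low_mask, vmax=np.abs(y).max())
print(yq.shape)
```

Any DOA estimator would then operate on `yq` together with knowledge of `Phi`; the information lost in both steps is what a CRB analysis of such a scheme has to account for.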
ESGN-YOLO: Enhancing Multi-Scale Small Object Detection via Efficient Feature Fusion and Adaptive Spatial Modeling
IF 3.9 | Zone 2, Engineering Technology | Q2 ENGINEERING, ELECTRICAL & ELECTRONIC | Pub Date: 2025-12-16 | DOI: 10.1109/LSP.2025.3644313
Zihao Guo;MeiLing Zhong;Shukai Duan;Lidan Wang
Object detection is crucial in remote sensing, surveillance, and autonomous driving. Detecting small objects remains challenging due to limited pixels, redundant backgrounds, and noise from viewpoint and illumination variations. To address these, we propose ESGN-YOLO, a lightweight model with three improvements. The Efficient Feature Fusion Module (EFFM) enhances multi-scale and directional feature extraction. The Shift-Wise Convolution (SWC) Bottleneck refines fine-grained features and suppresses background redundancy. The Group Normalisation Scale Head (GNSH) further improves detection accuracy and efficiency. Experiments on VisDrone2019 and RS-STOD show ESGN-YOLO achieves superior mAP@0.5 (34.5% and 76%) with a compact size (3.7 M parameters) and moderate computational cost (12.3 GFLOPs). Fast inference confirms its practicality for real-time UAV deployment and small-object detection under resource-constrained conditions.
IEEE Signal Processing Letters, vol. 33, pp. 426-430.
Citations: 0
PRISM-Occ: Path-Routed Integrated Sparse Mixture-of-Experts for Multi-Modal BEV Occupancy Prediction
IF 3.9 | Zone 2, Engineering Technology | Q2 ENGINEERING, ELECTRICAL & ELECTRONIC | Pub Date: 2025-12-16 | DOI: 10.1109/LSP.2025.3644948
Yujia Zhang;Hui Zhu;Chen Hua;Xinkai Kuang;Ziyu Chen;Chunmao Jiang
Bird's-eye-view (BEV) occupancy prediction estimates 3D occupied space from sequential sensor data, providing the environment model that underpins downstream planning and decision-making in autonomous driving. Existing methods often rely on dense fusion or naive feature stacking, which inflates compute and memory, yields poorly calibrated probabilities, and makes training brittle under occlusion and long-tail categories. We propose PRISM-Occ, a dual-level sparse Mixture-of-Experts framework for multi-modal BEV occupancy. A path-routed hierarchical router (PRHR) with Sparse Top-K activates only a compact set of experts within and across modalities, reducing parameter count while sharpening specialization. A heteroscedastic occupancy head predicts a spatial temperature map to improve calibration, and a simple prior adjustment with a staged hard-sample schedule stabilizes training under occlusion and rare classes. On Occ3D-nuScenes and SurroundOcc, PRISM-Occ achieves state-of-the-art accuracy and better-calibrated probabilities using single-scale 256 × 704 inputs and fixed, lower-resolution backbones, delivering a stronger accuracy–efficiency trade-off with reduced parameters and comparable runtime memory.
IEEE Signal Processing Letters, vol. 33, pp. 381-385.
Citations: 0
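The PRISM-Occ abstract above mentions Sparse Top-K expert activation. In generic form, top-k routing keeps the k largest router scores per token, normalizes only those, and runs only the selected experts. The sketch below shows that generic mechanism with linear experts; it is not the path-routed hierarchical router from the letter, and the names `sparse_topk_gate`, `moe_forward`, and `router_w`, along with the shapes, are illustrative assumptions.

```python
import numpy as np

def sparse_topk_gate(logits, k):
    """Keep the k largest router logits per token and softmax over those only."""
    top = np.argpartition(logits, -k, axis=-1)[:, -k:]       # indices of the k largest
    picked = np.take_along_axis(logits, top, axis=-1)
    picked = picked - picked.max(axis=-1, keepdims=True)     # numerical stability
    w = np.exp(picked)
    w /= w.sum(axis=-1, keepdims=True)
    gates = np.zeros_like(logits)
    np.put_along_axis(gates, top, w, axis=-1)                # zeros for inactive experts
    return gates

def moe_forward(x, router_w, experts, k=2):
    """Route each token to k experts and mix their outputs with the gate weights."""
    gates = sparse_topk_gate(x @ router_w, k)                # (tokens, n_experts)
    out = np.zeros_like(x)
    for e, W in enumerate(experts):
        rows = np.nonzero(gates[:, e])[0]                    # tokens sent to expert e
        if rows.size:
            out[rows] += gates[rows, e:e + 1] * (x[rows] @ W)
    return out, gates

rng = np.random.default_rng(4)
x = rng.standard_normal((6, 8))                              # 6 tokens, 8 features
router_w = rng.standard_normal((8, 4))                       # 4 experts
experts = [rng.standard_normal((8, 8)) for _ in range(4)]
out, gates = moe_forward(x, router_w, experts, k=2)
print(gates.round(3))
```

Only the experts with non-zero gates are evaluated for a given token, which is where the compute saving of sparse routing comes from.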
Virtual Reference Frame-Based Inter Prediction for MPEG Enhanced G-PCC
IF 3.9 | Zone 2, Engineering Technology | Q2 ENGINEERING, ELECTRICAL & ELECTRONIC | Pub Date: 2025-12-15 | DOI: 10.1109/LSP.2025.3644314
Xingjian Zhang;Yuxuan Wei;Zhe Liu;Zehan Wang;Hui Yuan
As the demand for 3D point clouds grows, the data volume is growing dramatically. To tackle this challenge, the Moving Picture Experts Group (MPEG) is developing the enhanced geometry-based point cloud compression (Enhanced G-PCC) standard, which uses Region-Adaptive Hierarchical Transform (RAHT) for highly efficient attribute coding. However, since the geometry of the current frame differs from that of the reference frame, their octree structures do not match, which degrades inter prediction performance. Therefore, we propose a virtual reference frame-based inter prediction method that aligns the geometry of the reference frame with that of the current frame. Specifically, the geometry of the virtual reference frame comes from the current frame, while its attribute information comes from the reference frame. Experimental results show that the proposed method can significantly increase the proportion of inter predicted RAHT coefficients and thus achieve average Bjøntegaard Delta Rates (BD-rates) of −6.3%, −8.9%, and −8.4% for the Luma, Cb, and Cr components, respectively, under the lossless geometry and lossy attribute coding condition, compared to the state-of-the-art Enhanced G-PCC reference software version 28 release candidate 2 (TMC13v28.0-rc2). For the coding condition of lossy geometry and lossy attribute, the corresponding BD-rates are −6.5%, −11.3%, and −7.7%, respectively.
IEEE Signal Processing Letters, vol. 33, pp. 301-305.
Citations: 0
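The G-PCC abstract above states that the virtual reference frame takes its geometry from the current frame and its attributes from the reference frame, but not how attributes are assigned to points. One simple possibility, assumed here purely for illustration, is nearest-neighbour transfer: each current-frame point borrows the attribute of the closest reference-frame point. The function name and the brute-force distance computation are choices of this sketch.

```python
import numpy as np

def virtual_reference_frame(cur_xyz, ref_xyz, ref_attr):
    """Geometry from the current frame, attributes borrowed from the reference frame.

    Each current point takes the attribute of its nearest reference point
    (nearest-neighbour transfer is an assumption of this sketch).
    """
    # Pairwise squared distances, shape (n_cur, n_ref); adequate for small clouds.
    d2 = np.sum((cur_xyz[:, None, :] - ref_xyz[None, :, :]) ** 2, axis=-1)
    nearest = np.argmin(d2, axis=1)
    return cur_xyz, ref_attr[nearest]

ref_xyz = np.array([[0, 0, 0], [10, 0, 0], [0, 10, 0]], dtype=float)
ref_attr = np.array([[255, 0, 0], [0, 255, 0], [0, 0, 255]])
cur_xyz = np.array([[1, 0, 0], [9, 1, 0], [0, 9, 1], [1, 1, 1]], dtype=float)
virt_xyz, virt_attr = virtual_reference_frame(cur_xyz, ref_xyz, ref_attr)
print(virt_attr)
```

Because the virtual frame shares the current frame's point positions, an octree built on it has the same structure as the current frame's octree, so transform coefficients line up node by node.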
Locally Shuffled Low Rank Column-Wise Sensing
IF 3.9 | Zone 2, Engineering Technology | Q2 ENGINEERING, ELECTRICAL & ELECTRONIC | Pub Date: 2025-12-15 | DOI: 10.1109/LSP.2025.3644669
Ahmed Ali Abbasi;Namrata Vaswani
We introduce and precisely formulate the Low Rank Columnwise matrix Sensing (LRCS) problem when some of the observed data is scrambled / permuted / shuffled / unlabeled. Shuffled LRCS is a more difficult problem than LRCS alone because there are three unknown variable sets and one of them is discrete. Our proposed algorithm for solving it is the first multi-block generalization of the Alternating GD and Minimization (AltGDmin) algorithm that was introduced in recent work for fast LRCS. Since this is a new problem, no solutions exist. We also develop the AltMin solution and provide extensive numerical comparisons demonstrating that the proposed AltGDmin-based method is much faster than AltMin. As a baseline, we use AltGDmin-LRCS and AltMin-LRCS for a collapsed version of this problem, which becomes an LRCS problem. Our experiments show that, when the available number of measurements is small, this baseline fails, while our proposed method works. Finally, we bound the per-iteration time complexity of our algorithm and also provide a guarantee for its initialization step.
IEEE Signal Processing Letters, vol. 33, pp. 446-450.
Citations: 0
Extended Node-Specific Distributed Generalized Sidelobe Canceler for Outdoor Wireless Acoustic Sensor Networks
IF 3.9 | Zone 2, Engineering Technology | Q2 ENGINEERING, ELECTRICAL & ELECTRONIC | Pub Date: 2025-12-15 | DOI: 10.1109/LSP.2025.3644315
Shiqin Li;Jing Hu;Zhao Zhao;Zhiyong Xu
In distributed sound source enhancement (SSE) tasks using microphone array nodes, the state-of-the-art node-specific distributed generalized sidelobe canceler (NS-DGSC) algorithm has achieved remarkable performance in simultaneously enhancing multiple desired sources. However, its assumption of an equal number of nodes and sources usually does not hold in outdoor applications. This letter proposes an extended NS-DGSC (ENS-DGSC) algorithm to tackle this issue. A correlation check module is introduced to handle scenarios where nodes outnumber or match sources. Furthermore, a temporal alignment module using two different strategies is designed to address time delays among nodes. Evaluations reveal that the proposed ENS-DGSC not only retains the advantages of the NS-DGSC, but also provides superior enhancement performance with more nodes than sources.
在使用麦克风阵列节点的分布式声源增强(SSE)任务中,最先进的节点特定分布式广义旁瓣消除(NS-DGSC)算法在同时增强多个期望声源方面取得了显著的性能。然而,其假设的相等数量的节点和源通常不适用于户外应用。本文提出了一种扩展的NS-DGSC (ENS-DGSC)算法来解决这个问题。引入相关性检查模块来处理节点数量超过或匹配源的场景。此外,还设计了使用两种不同策略的时间对齐模块来解决节点间的时间延迟问题。评估结果表明,所提出的NS-DGSC不仅保留了NS-DGSC的优点,而且在节点多于源的情况下具有优越的增强性能。
{"title":"Extended Node-Specific Distributed Generalized Sidelobe Canceler for Outdoor Wireless Acoustic Sensor Networks","authors":"Shiqin Li;Jing Hu;Zhao Zhao;Zhiyong Xu","doi":"10.1109/LSP.2025.3644315","DOIUrl":"https://doi.org/10.1109/LSP.2025.3644315","url":null,"abstract":"In distributed sound source enhancement (SSE) tasks using microphone array nodes, state-of-the-art node-specific distributed generalized sidelobe canceler (NS-DGSC) algorithm has achieved remarkable performance for simultaneously enhancing multiple desired sources. However, its assumption of an equal number of nodes and sources usually does not hold in outdoor applications. This letter proposes an extended NS-DGSC (ENS-DGSC) algorithm to tackle this issue. A correlation check module is introduced to handle scenarios where nodes outnumber or match sources. Furthermore, a temporal alignment module using two different strategies is designed to address time delays among nodes. Evaluations reveal that the proposed ENS-DGSC not only retains advantages of the NS-DGSC, but also provides superior enhancement performance with more nodes than sources.","PeriodicalId":13154,"journal":{"name":"IEEE Signal Processing Letters","volume":"33 ","pages":"306-310"},"PeriodicalIF":3.9,"publicationDate":"2025-12-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145830892","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
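The abstract above does not say which two strategies the temporal alignment module uses. As a generic illustration of what aligning signals across nodes involves, here is a minimal sketch that estimates an integer-sample delay from the peak of a cross-correlation. The function name `estimate_delay` and the test signal are hypothetical, and this is not the authors' method.

```python
import numpy as np

def estimate_delay(ref, sig):
    """Estimate the integer lag d (in samples) such that sig[n] ~ ref[n - d].

    Illustrative only: a plain cross-correlation peak search, assumed here
    as a stand-in for whatever alignment strategy a real system would use.
    """
    corr = np.correlate(sig, ref, mode="full")
    # Index i of the 'full' output corresponds to lag i - (len(ref) - 1).
    return int(np.argmax(corr)) - (len(ref) - 1)

# Usage: delay a noise signal by 5 samples, then recover the delay.
rng = np.random.default_rng(0)
ref = rng.standard_normal(256)
sig = np.concatenate([np.zeros(5), ref[:-5]])  # ref delayed by 5 samples
d = estimate_delay(ref, sig)
```

A sub-sample or frequency-domain estimator would be needed when delays are not integer multiples of the sampling period; the sketch ignores that case.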
Explicit-Implicit Prompt Injection and Semantic-Guided Latent LoRA for Vision-Language Tracking
IF 3.9 Zone 2 Engineering & Technology Q2 ENGINEERING, ELECTRICAL & ELECTRONIC Pub Date: 2025-12-12 DOI: 10.1109/LSP.2025.3643354
Jiapeng Zhang;Ying Wei;Yongfeng Li;Gang Yang;Qiaohong Hao
Prompt-based learning has shown promise in vision-language tracking (VLT), yet existing methods often rely on either explicit or implicit prompting alone, limiting fine-grained cross-modal alignment. Moreover, Low-Rank Adaptation (LoRA)-based fine-tuning in prior work typically focuses on visual-only adaptation, overlooking language semantics. To address these issues, we propose a unified VLT framework that integrates Explicit-Implicit Prompt Injection (EIPI) and Semantic-Guided Latent LoRA (SGLL). EIPI introduces semantic prompts to facilitate robust and context-sensitive target modeling through two pathways. The explicit prompts are constructed through interaction between multi-modal target representations and the search region, while the implicit prompts are learned from linguistic features via a lightweight bottleneck network. Then, SGLL extends standard LoRA by introducing learnable queries in the latent space, allowing residual modulation based on language-visual semantics without retraining the full model. This dual design yields a parameter-efficient tracker with strong cross-modal adaptability. Extensive experiments show that our method outperforms prior prompt-based approaches while maintaining high efficiency.
IEEE Signal Processing Letters, vol. 33, pp. 376-380, published 2025-12-12.
Citations: 0
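For context on the baseline that SGLL is said to extend: standard LoRA keeps a pretrained weight matrix W frozen and learns a low-rank update BA, scaled by alpha/r. The sketch below shows only that standard formulation in NumPy. The function name `lora_forward` and the shapes are assumptions for illustration; the learnable latent queries and semantic guidance of SGLL are not specified in the abstract and are not modeled here.

```python
import numpy as np

def lora_forward(x, W, A, B, alpha):
    """Standard LoRA forward pass: y = x W^T + (alpha / r) * x A^T B^T.

    x: (n, d_in) inputs; W: (d_out, d_in) frozen pretrained weight;
    A: (r, d_in) and B: (d_out, r) are the trainable low-rank factors.
    """
    r = A.shape[0]
    return x @ W.T + (alpha / r) * (x @ A.T) @ B.T

# Usage: with B initialised to zero, the adapted layer equals the frozen one.
rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8))
W = rng.standard_normal((6, 8))
A = rng.standard_normal((2, 8))
B = np.zeros((6, 2))
y = lora_forward(x, W, A, B, alpha=4.0)
```

Only A and B would be updated during fine-tuning, which is where the parameter efficiency comes from: r * (d_in + d_out) trainable values instead of d_in * d_out.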