Underwater object detection is hampered by limited long-range dependency modeling, weak fine-grained feature representation, and insufficient noise suppression, which result in blurred boundaries, frequent missed detections, and reduced robustness. To address these challenges, we propose the Mamba-Driven Feature Pyramid Decoding (MFPD) framework, which employs parallel Feature Pyramid Network and Path Aggregation Network pathways that collaborate to enhance semantic and geometric features. A lightweight Mamba Block models long-range dependencies, while an Adaptive Sparse Self-Attention module highlights discriminative targets and suppresses noise. Together, these components improve feature representation and robustness. Experiments on two publicly available underwater datasets demonstrate that MFPD significantly outperforms existing methods, validating its effectiveness in complex underwater environments. The code is publicly available at: https://github.com/YitengGuo/MFPD
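The letter does not detail the Adaptive Sparse Self-Attention module, so the following is only a minimal sketch of one common way to sparsify self-attention: keep the top-scoring keys for each query and mask the rest before the softmax. The class name `SparseSelfAttention` and the `keep_ratio` parameter are illustrative assumptions, not taken from the paper.

```python
import torch
import torch.nn as nn

class SparseSelfAttention(nn.Module):
    """Top-k sparse self-attention: each query attends only to its highest-scoring keys."""
    def __init__(self, dim: int, heads: int = 4, keep_ratio: float = 0.25):
        super().__init__()
        self.heads = heads
        self.scale = (dim // heads) ** -0.5
        self.keep_ratio = keep_ratio              # fraction of keys each query may attend to
        self.qkv = nn.Linear(dim, dim * 3)
        self.proj = nn.Linear(dim, dim)

    def forward(self, x):                         # x: (batch, tokens, dim)
        b, n, d = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        q, k, v = (t.view(b, n, self.heads, -1).transpose(1, 2) for t in (q, k, v))
        attn = (q @ k.transpose(-2, -1)) * self.scale          # (b, heads, n, n)
        k_keep = max(1, int(n * self.keep_ratio))
        thresh = attn.topk(k_keep, dim=-1).values[..., -1:]    # k-th largest score per query
        attn = attn.masked_fill(attn < thresh, float("-inf"))  # suppress low-relevance (noisy) keys
        out = attn.softmax(dim=-1) @ v
        return self.proj(out.transpose(1, 2).reshape(b, n, d))
```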
{"title":"MFPD: Mamba-Driven Feature Pyramid Decoding for Underwater Object Detection","authors":"Yiteng Guo;Junpeng Xu;Jiali Wang;Wenyi Zhao;Weidong Zhang","doi":"10.1109/LSP.2025.3639347","DOIUrl":"https://doi.org/10.1109/LSP.2025.3639347","url":null,"abstract":"Underwater object detection suffers from limited long-range dependency modeling, fine-grained feature representation, and noise suppression, resulting in blurred boundaries, frequent missed detections, and reduced robustness. To address these challenges, we propose the Mamba-Driven Feature Pyramid Decoding framework, which employs a parallel Feature Pyramid Network and Path Aggregation Network collaborative pathway to enhance semantic and geometric features. A lightweight Mamba Block models long-range dependencies, while an Adaptive Sparse Self-Attention module highlights discriminative targets and suppresses noise. Together, these components improve feature representation and robustness. Experiments on two publicly available underwater datasets demonstrate that MFPD significantly outperforms existing methods, validating its effectiveness in complex underwater environments. The code is publicly available at: <uri>https://github.com/YitengGuo/MFPD</uri>","PeriodicalId":13154,"journal":{"name":"IEEE Signal Processing Letters","volume":"33 ","pages":"141-145"},"PeriodicalIF":3.9,"publicationDate":"2025-12-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145772094","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-12-02 | DOI: 10.1109/LSP.2025.3639375
Liusong Huang;Adam Amril bin Jaharadak;Nor Izzati Ahmad;Jie Wang;Dalin Zhang
Thermal power plants rely on extensive sensor networks to monitor key operational parameters, yet harsh industrial environments often lead to incomplete data characterized by significant noise, complex physical dependencies, and abrupt state transitions. This impedes accurate monitoring and predictive analyses. To address these domain-specific challenges, we propose a novel Hybrid Multi-Head Attention (HybridMHA) model for time series imputation. The core novelty of our approach lies in the synergistic combination of diagonally-masked self-attention and dynamic sparse attention. Specifically, the diagonally-masked component strictly preserves temporal causality to model the sequential evolution of plant states, while the dynamic sparse component selectively identifies critical cross-variable dependencies, effectively filtering out sensor noise. This tailored design enables the model to robustly capture sparse physical inter-dependencies even during abrupt operational shifts. Using a real-world dataset from a thermal power plant, our model demonstrates statistically significant improvements, outperforming existing methods by 10%–20% on key metrics. Further validation on a public benchmark dataset confirms its generalizability. These findings highlight the model's potential for robust real-time monitoring in complex industrial applications.
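As a minimal illustration of the diagonally-masked idea, the sketch below masks each time step's attention to itself, so a corrupted or missing reading must be reconstructed from the other steps; how HybridMHA combines this with its causality constraint is not specified in the abstract, and the function name and array shapes here are assumptions.

```python
import numpy as np

def diagonal_masked_attention(Q, K, V):
    """Q, K, V: (T, d) arrays for one sensor sequence of length T >= 2."""
    T, d = Q.shape
    scores = Q @ K.T / np.sqrt(d)                  # (T, T) attention logits
    np.fill_diagonal(scores, -np.inf)              # a step may not attend to itself
    scores -= scores.max(axis=-1, keepdims=True)   # numerically stable softmax
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V                             # (T, d) features used to impute each step
```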
{"title":"Addressing Missing Data in Thermal Power Plant Monitoring With Hybrid Attention Time Series Imputation","authors":"Liusong Huang;Adam Amril bin Jaharadak;Nor Izzati Ahmad;Jie Wang;Dalin Zhang","doi":"10.1109/LSP.2025.3639375","DOIUrl":"https://doi.org/10.1109/LSP.2025.3639375","url":null,"abstract":"Thermal power plants rely on extensive sensor networks to monitor key operational parameters, yet harsh industrial environments often lead to incomplete data characterized by significant noise, complex physical dependencies, and abrupt state transitions. This impedes accurate monitoring and predictive analyses. To address these domain-specific challenges, we propose a novel Hybrid Multi-Head Attention (HybridMHA) model for time series imputation. The core novelty of our approach lies in the synergistic combination of diagonally-masked self-attention and dynamic sparse attention. Specifically, the diagonally-masked component strictly preserves temporal causality to model the sequential evolution of plant states, while the dynamic sparse component selectively identifies critical cross-variable dependencies, effectively filtering out sensor noise. This tailored design enables the model to robustly capture sparse physical inter-dependencies even during abrupt operational shifts.Using a real-world dataset from a thermal power plant, our model demonstrates statistically significant improvements, outperforming existing methods by 10%–20% on key metrics. Further validation on a public benchmark dataset confirms its generalizability. These findings highlight the model's potential for robust real-time monitoring in complex industrial applications.","PeriodicalId":13154,"journal":{"name":"IEEE Signal Processing Letters","volume":"33 ","pages":"536-540"},"PeriodicalIF":3.9,"publicationDate":"2025-12-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146026338","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-11-28 | DOI: 10.1109/LSP.2025.3638634
Yu Lian;Xinshan Zhu;Di He;Biao Sun;Ruyi Zhang
The rapid development and malicious use of deepfakes pose a significant crisis of trust. To cope with the evolving deepfake technologies, an increasing number of detection methods adopt the continual learning paradigm, but they often suffer from catastrophic forgetting. Although replay-based methods mitigate this issue by storing a portion of samples from historical tasks, their sample selection strategies usually rely on a single metric, which may lead to the omission of critical samples and consequently hinder the construction of a robust instance memory bank. In this letter, we propose a novel Multi-perspective Sample Selection Mechanism (MSSM) for continual deepfake detection, which jointly evaluates prediction error, temporal instability, and sample diversity to preserve informative and challenging samples in the instance memory bank. Furthermore, we design a Hierarchical Prototype Generation Mechanism (HPGM) that constructs prototypes at both the category and task levels, which are stored in the prototype memory bank. Extensive experiments under two evaluation protocols demonstrate that the proposed method achieves state-of-the-art performance.
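As a hedged sketch of how the three perspectives might be combined when filling the instance memory bank: normalized prediction error and temporal instability form a base score, and diversity is enforced greedily by rewarding distance to already-selected samples in feature space. The weights `w` and the greedy loop are illustrative assumptions, not the letter's exact formulation.

```python
import numpy as np

def select_memory_samples(errors, pred_history, feats, k, w=(1.0, 1.0, 1.0)):
    """
    errors:       (N,)   per-sample prediction error on the current task
    pred_history: (E, N) predictions recorded over the last E epochs
    feats:        (N, d) feature embeddings; k: memory budget (k <= N)
    """
    def norm(x):
        return (x - x.min()) / (x.max() - x.min() + 1e-8)

    instability = pred_history.std(axis=0)                 # temporal instability
    base = w[0] * norm(errors) + w[1] * norm(instability)

    selected = []
    for _ in range(k):
        if selected:
            # diversity: distance to the closest sample already in the memory bank
            d = np.linalg.norm(feats[:, None, :] - feats[selected][None, :, :], axis=-1)
            div = norm(d.min(axis=1))
        else:
            div = np.zeros_like(base)
        total = base + w[2] * div
        total[selected] = -np.inf                          # never pick a sample twice
        selected.append(int(total.argmax()))
    return selected
```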
{"title":"Continual Deepfake Detection Based on Multi-Perspective Sample Selection Mechanism","authors":"Yu Lian;Xinshan Zhu;Di He;Biao Sun;Ruyi Zhang","doi":"10.1109/LSP.2025.3638634","DOIUrl":"https://doi.org/10.1109/LSP.2025.3638634","url":null,"abstract":"The rapid development and malicious use of deepfakes pose a significant crisis of trust. To cope with the evolving deepfake technologies, an increasing number of detection methods adopt the continual learning paradigm, but they often suffer from catastrophic forgetting. Although replay-based methods mitigate this issue by storing a portion of samples from historical tasks, their sample selection strategies usually rely on a single metric, which may lead to the omission of critical samples and consequently hinder the construction of a robust instance memory bank. In this letter, we propose a novel Multi-perspective Sample Selection Mechanism (MSSM) for continual deepfake detection, which jointly evaluates prediction error, temporal instability, and sample diversity to preserve informative and challenging samples in the instance memory bank. Furthermore, we design a Hierarchical Prototype Generation Mechanism (HPGM) that constructs prototypes at both the category and task levels, which are stored in the prototype memory bank. Extensive experiments under two evaluation protocols demonstrate that the proposed method achieves state-of-the-art performance.","PeriodicalId":13154,"journal":{"name":"IEEE Signal Processing Letters","volume":"33 ","pages":"131-135"},"PeriodicalIF":3.9,"publicationDate":"2025-11-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145778420","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Advanced driving simulations are increasingly used in automated driving research, yet freely available data and tools remain limited. We present a new open-source framework for synthetic data generation for lane change (LC) intention recognition on highways. Built on the CARLA simulator, it advances the state of the art by providing a 50-driver dataset, a large-scale 3D map, and code for reproducibility and new data creation. The 60 km highway map includes varying curvature radii and straight segments. The codebase supports simulation enhancements (traffic management, vehicle cockpit, engine noise) and Machine Learning (ML) model training and evaluation, including CARLA log post-processing into time series. The dataset contains over 3,400 annotated LC maneuvers with synchronized ego dynamics, road geometry, and traffic context. From an automotive industry perspective, we also assess leading-edge ML models on STM32 microcontrollers using deployability metrics. Unlike prior infrastructure-based works, we estimate time-to-LC from ego-centric data. Results show that a Transformer model yields the lowest regression error, while XGBoost offers the best trade-offs on extremely resource-constrained devices. The entire framework is publicly released to support advancement in automated driving research.
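For readers unfamiliar with the task setup, the following is one plausible way to turn a per-frame ego-centric log into (window, time-to-LC) regression pairs; the window length, horizon, and field layout are hypothetical and do not reflect the released codebase.

```python
import numpy as np

def make_time_to_lc_samples(signals, lc_frame, window=50, horizon=150, stride=5):
    """
    signals:  (T, F) per-frame ego-centric features (speed, lateral offset, ...)
    lc_frame: frame index at which the annotated lane change begins (>= window)
    Returns (X, y): sliding windows and their time-to-LC labels, in frames.
    """
    X, y = [], []
    for end in range(window, lc_frame + 1, stride):
        ttlc = lc_frame - end                     # frames remaining until the maneuver
        if ttlc <= horizon:                       # skip windows far from the maneuver
            X.append(signals[end - window:end])
            y.append(ttlc)
    return np.stack(X), np.asarray(y, dtype=np.float32)
```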
{"title":"A Deployment-Oriented Simulation Framework for Deep Learning-Based Lane Change Prediction","authors":"Luca Forneris;Riccardo Berta;Matteo Fresta;Luca Lazzaroni;Hadise Rojhan;Changjae Oh;Alessandro Pighetti;Hadi Ballout;Fabio Tango;Francesco Bellotti","doi":"10.1109/LSP.2025.3638676","DOIUrl":"https://doi.org/10.1109/LSP.2025.3638676","url":null,"abstract":"Advanced driving simulations are increasingly used in automated driving research, yet freely available data and tools remain limited. We present a new open-source framework for synthetic data generation for lane change (LC) intention recognition in highways. Built on the CARLA simulator, it advances the state-of-the-art by providing a 50-driver dataset, a large-scale 3D map, and code for reproducibility and new data creation. The 60 km highway map includes varying curvature radii and straight segments. The codebase supports simulation enhancements (traffic management, vehicle cockpit, engine noise) and Machine Learning (ML) model training and evaluation, including CARLA log post-processing into time series. The dataset contains over 3,400 annotated LC maneuvers with synchronized ego dynamics, road geometry, and traffic context. From an automotive industry perspective, we also assess leading-edge ML models on STM32 microcontrollers using deployability metrics. Unlike prior infrastructure-based works, we estimate time-to-LC from ego-centric data. Results show that a Transformer model yields the lowest regression error, while XGBoost offers the best trade-offs on extremely resource-constrained devices. The entire framework is publicly released to support advancement in automated driving research.","PeriodicalId":13154,"journal":{"name":"IEEE Signal Processing Letters","volume":"33 ","pages":"136-140"},"PeriodicalIF":3.9,"publicationDate":"2025-11-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11271346","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145778421","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-11-28 | DOI: 10.1109/LSP.2025.3638688
Jun Liu;Wei Ke;Shuai Wang;Da Yang;Hao Sheng
Visual tracking that combines RGB and thermal infrared modalities (RGB-T) aims to exploit the useful information in each modality to achieve more robust object localization. Most existing tracking methods based on convolutional neural networks (CNNs) and Transformers emphasize integrating multi-modal features through cross-modal attention, but overlook that the complementary information learned by cross-modal attention can itself be exploited to enhance modality-specific features. In this paper, we propose a novel hierarchical progressive fusion network based on cross-modal attention guided enhancement for RGB-T tracking. Specifically, the complementary information generated by cross-modal attention implicitly reflects the regions of interest that the modalities consistently deem important, and it is used to enhance modal features in a targeted manner. In addition, a modal feature refinement module and a fusion module are designed based on dynamic routing to perform noise suppression and adaptive integration on the enhanced multi-modal features. Extensive experiments on GTOT, RGBT234, LasHeR and VTUAV show that our method achieves competitive performance compared with recent state-of-the-art methods.
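A minimal sketch of attention-guided enhancement, under the assumption that RGB tokens query thermal tokens and the resulting complementary response gates and augments the RGB features; the module name and the gating form are illustrative, not the letter's exact design (a symmetric thermal-enhancing branch would mirror it).

```python
import torch
import torch.nn as nn

class CrossModalEnhance(nn.Module):
    """RGB tokens attend to thermal tokens; the complementary response re-weights
    and strengthens the RGB features."""
    def __init__(self, dim: int, heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.gate = nn.Sequential(nn.Linear(dim, dim), nn.Sigmoid())

    def forward(self, rgb, tir):                  # each: (batch, tokens, dim)
        comp, _ = self.attn(query=rgb, key=tir, value=tir)   # complementary information
        return rgb + self.gate(comp) * comp                  # attention-guided enhancement
```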
{"title":"Cross-Modal Attention Guided Enhanced Fusion Network for RGB-T Tracking","authors":"Jun Liu;Wei Ke;Shuai Wang;Da Yang;Hao Sheng","doi":"10.1109/LSP.2025.3638688","DOIUrl":"https://doi.org/10.1109/LSP.2025.3638688","url":null,"abstract":"Visual tracking that combines RGB and thermal infrared modalities (RGB-T) aims to utilize the useful information of each modality to achieve more robust object localization. Most existing tracking methods based on convolutional neural networks (CNNs) and Transformers emphasize integrating multi-modal features through cross-modal attention, but ignore the potential exploitability of complementary information learned by cross-modal attention for enhancing modal features. In this paper, we propose a novel hierarchical progressive fusion network based on cross-modal attention guided enhancement for RGB-T tracking. Specifically, the complementary information generated by cross-modal attention implicitly reflects the consistent regions of interest of important information between different modalities, which is used to enhance modal features in a targeted manner. In addition, a modal feature refinement module and a fusion module are designed based on dynamic routing to perform noise suppression and adaptive integration on the enhanced multi-modal features. Extensive experiments on GTOT, RGBT234, LasHeR and VTUAV show that our method has competitive performance compared with recent state-of-the-art methods.","PeriodicalId":13154,"journal":{"name":"IEEE Signal Processing Letters","volume":"33 ","pages":"276-280"},"PeriodicalIF":3.9,"publicationDate":"2025-11-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145830888","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
With the widespread adoption of smart terminals, compressed video is increasingly consumed at the receiver for purposes beyond human vision. Conventional video coding standards are optimized primarily for human visual perception and often fail to accommodate the distinct requirements of machine vision. To simultaneously satisfy the perceptual needs of human viewers and the analytical demands of machine vision, we propose a novel rate control scheme based on Versatile Video Coding (VVC) for human-machine vision collaborative video coding. Specifically, we employ the You Only Look Once (YOLO) network to extract task-relevant features for machine vision and formulate a detection feature weight based on these features. Leveraging the feature weight and the spatial location information of Coding Tree Units (CTUs), we propose a region classification algorithm that partitions a frame into a machine vision-sensitive region (MVSR) and a machine vision non-sensitive region (MVNR). Subsequently, we develop an enhanced and refined bit allocation strategy that performs region-level and CTU-level bit allocation, thereby improving the precision and effectiveness of the rate control. Experimental results demonstrate that the scheme improves machine task detection accuracy while preserving perceptual quality for human observers, effectively meeting the dual encoding requirements of human and machine vision.
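As a rough illustration of the region classification and bit allocation steps, the sketch below thresholds a per-CTU detection feature weight into MVSR and MVNR and splits the frame budget between the two regions, with CTU-level bits proportional to weight. The threshold and the MVSR share are hypothetical parameters; how the scheme is integrated into the actual VVC rate control is considerably more involved.

```python
import numpy as np

def allocate_ctu_bits(ctu_weights, frame_bits, thresh=0.5, mvsr_share=0.7):
    """
    ctu_weights: (N,) detection feature weight per CTU, normalised to [0, 1]
    frame_bits:  total bit budget for the frame
    Returns per-CTU bit targets; MVSR CTUs share `mvsr_share` of the budget.
    """
    mvsr = ctu_weights >= thresh                          # machine vision-sensitive CTUs
    bits = np.zeros(len(ctu_weights), dtype=float)
    for region, share in ((mvsr, mvsr_share), (~mvsr, 1.0 - mvsr_share)):
        if region.any():
            w = ctu_weights[region] + 1e-6
            bits[region] = share * frame_bits * w / w.sum()   # CTU-level split by weight
    return bits
```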
{"title":"Human-Machine Vision Collaboration Based Rate Control Scheme for VVC","authors":"Zeming Zhao;Xiaohai He;Xiaodong Bi;Hong Yang;Shuhua Xiong","doi":"10.1109/LSP.2025.3638597","DOIUrl":"https://doi.org/10.1109/LSP.2025.3638597","url":null,"abstract":"With the widespread adoption of smart terminals, compressed video is increasingly utilized in the receiver for purposes beyond human vision. Conventional video coding standards are optimized primarily for human visual perception and often fail to accommodate the distinct requirements of machine vision. To simultaneously satisfy the perceptual needs and the analytical demands, we propose a novel rate control scheme based on Versatile Video Coding (VVC) for human-machine vision collaborative video coding. Specifically, we employ the You Only Look Once (YOLO) network to extract task-relevant features for machine vision and formulate a detection feature weight based on these features. Leveraging the feature weight and the spatial location information of Coding Tree Units (CTUs), we propose a region classification algorithm that partitions a frame into machine vision-sensitive region (MVSR) and machine vision non-sensitive region (MVNR). Subsequently, we develop an enhanced and refined bit allocation strategy that performs region-level and CTU-level bit allocation, thereby improving the precision and effectiveness of the rate control. Experimental results demonstrate that the scheme improves machine task detection accuracy while preserving perceptual quality for human observers, effectively meeting the dual encoding requirements of human and machine vision.","PeriodicalId":13154,"journal":{"name":"IEEE Signal Processing Letters","volume":"33 ","pages":"126-130"},"PeriodicalIF":3.9,"publicationDate":"2025-11-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145778285","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-11-27 | DOI: 10.1109/LSP.2025.3637728
Haitao Yang;Yingzhuo Xiong;Dongliang Zhang;Xiai Yan;Xuran Hu
Small-object detection in uncrewed aerial vehicle (UAV) imagery remains challenging due to limited resolution, complex backgrounds, scale variation, and strict real-time constraints. Existing lightweight detectors often struggle to retain fine details while ensuring efficiency, which reduces robustness in UAV applications. This letter proposes a lightweight multi-scale framework integrating Partial Dilated Convolution (PDC), a Triplet Focus Attention Module (TFAM), a Multi-Scale Feature Fusion (MSFF) branch, and a bidirectional feature pyramid network (BiFPN). PDC enlarges receptive field diversity while preserving local texture, TFAM jointly enhances spatial, channel, and coordinate attention, and MSFF with BiFPN achieves efficient cross-scale fusion. On VisDrone2019, our model reaches 52.7% mAP50 with 6.01M parameters and 148 FPS, and on HIT-UAV it yields 85.2% mAP50 and 155 FPS, surpassing state-of-the-art UAV detectors in accuracy and efficiency. Visualizations further verify robustness in low-light, dense, and scale-varying UAV scenes.
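The letter does not spell out Partial Dilated Convolution, so the sketch below assumes a partial-convolution-style split in which only a fraction of the channels passes through a dilated 3x3 convolution while the remaining channels are forwarded unchanged, preserving local texture at low cost; the channel ratio and dilation value are illustrative.

```python
import torch
import torch.nn as nn

class PartialDilatedConv(nn.Module):
    """Dilated 3x3 convolution on a subset of channels; the rest pass through untouched."""
    def __init__(self, channels: int, ratio: float = 0.25, dilation: int = 2):
        super().__init__()
        self.conv_ch = max(1, int(channels * ratio))       # channels that get convolved
        self.conv = nn.Conv2d(self.conv_ch, self.conv_ch, kernel_size=3,
                              padding=dilation, dilation=dilation)

    def forward(self, x):                                  # x: (B, C, H, W)
        xc, xp = x[:, :self.conv_ch], x[:, self.conv_ch:]
        return torch.cat([self.conv(xc), xp], dim=1)       # spatial size is preserved
```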
{"title":"Lightweight Attention-Enhanced Multi-Scale Detector for Robust Small Object Detection in UAV","authors":"Haitao Yang;Yingzhuo Xiong;Dongliang Zhang;Xiai Yan;Xuran Hu","doi":"10.1109/LSP.2025.3637728","DOIUrl":"https://doi.org/10.1109/LSP.2025.3637728","url":null,"abstract":"Small-object detection in uncrewed aerial vehicle (UAV) imagery remains challenging due to limited resolution, complex backgrounds, scale variation, and strict real-time constraints. Existing lightweight detectors often struggle to retain fine details while ensuring efficiency, reducing robustness in UAV applications. This letter proposes a lightweight multi-scale framework integrating Partial Dilated Convolution (PDC), a Triplet Focus Attention Module (TFAM), a Multi-Scale Feature Fusion (MSFF) branch, and a bidirectional BiFPN. PDC enlarges receptive field diversity while preserving local texture, TFAM jointly enhances spatial, channel, and coordinate attention, and MSFF with BiFPN achieves efficient cross-scale fusion. On VisDrone2019, our model reaches 52.7% mAP50 with 6.01M parameters and 148 FPS, and on HIT-UAV yields 85.2% mAP50 and 155 FPS, surpassing state-of-the-art UAV detectors in accuracy and efficiency. Visualization further verifies robustness under low-light, dense, and scale-varying UAV scenes.","PeriodicalId":13154,"journal":{"name":"IEEE Signal Processing Letters","volume":"33 ","pages":"271-275"},"PeriodicalIF":3.9,"publicationDate":"2025-11-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145830859","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-11-24 | DOI: 10.1109/LSP.2025.3636762
Weicao Deng;Sangwoo Park;Min Li;Osvaldo Simeone
Reliable uncertainty quantification is critical for trustworthy AI. Conformal Prediction (CP) provides prediction sets with distribution-free coverage guarantees, but its two main variants face complementary limitations. Split CP (SCP) suffers from data inefficiency due to dataset partitioning, while full CP (FCP) improves data efficiency at the cost of prohibitive retraining complexity. Recent approaches based on meta-learning or in-context learning (ICL) partially mitigate these drawbacks. However, they rely on training procedures not specifically tailored to CP, which may yield large prediction sets. We introduce an efficient FCP framework, termed enhanced ICL-based FCP (E-ICL+FCP), which employs a permutation-invariant Transformer-based ICL model trained with a CP-aware loss. By simulating the multiple retrained models required by FCP without actual retraining, E-ICL+FCP preserves coverage while markedly reducing both inefficiency and computational overhead. Experiments on synthetic and real tasks demonstrate that E-ICL+FCP attains superior efficiency-coverage trade-offs compared to existing SCP and FCP baselines.
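For context on the SCP baseline discussed above, here is the standard split conformal prediction routine for classification: nonconformity scores are computed on a held-out calibration split, and a finite-sample quantile of those scores defines the prediction sets. This is the textbook SCP procedure, not the proposed E-ICL+FCP method.

```python
import numpy as np

def split_conformal_sets(cal_probs, cal_labels, test_probs, alpha=0.1):
    """
    cal_probs:  (n, K) softmax outputs on the calibration split
    cal_labels: (n,)   true labels of the calibration split
    test_probs: (m, K) softmax outputs on test inputs
    Returns m prediction sets with >= 1 - alpha marginal coverage.
    """
    n = len(cal_labels)
    scores = 1.0 - cal_probs[np.arange(n), cal_labels]       # nonconformity scores
    q = np.ceil((n + 1) * (1 - alpha)) / n                   # finite-sample correction
    qhat = np.quantile(scores, min(q, 1.0), method="higher")
    return [np.where(1.0 - p <= qhat)[0] for p in test_probs]
```

The data split is exactly what makes SCP data-inefficient: the calibration samples cannot be used for training, which is the gap FCP (and E-ICL+FCP) aims to close.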
{"title":"Optimizing In-Context Learning for Efficient Full Conformal Prediction","authors":"Weicao Deng;Sangwoo Park;Min Li;Osvaldo Simeone","doi":"10.1109/LSP.2025.3636762","DOIUrl":"https://doi.org/10.1109/LSP.2025.3636762","url":null,"abstract":"Reliable uncertainty quantification is critical for trustworthy AI. Conformal Prediction (CP) provides prediction sets with distribution-free coverage guarantees, but its two main variants face complementary limitations. Split CP (SCP) suffers from data inefficiency due to dataset partitioning, while full CP (FCP) improves data efficiency at the cost of prohibitive retraining complexity. Recent approaches based on meta-learning or in-context learning (ICL) partially mitigate these drawbacks. However, they rely on training procedures not specifically tailored to CP, which may yield large prediction sets. We introduce an efficient FCP framework, termed enhanced ICL-based FCP (E-ICL+FCP), which employs a permutation-invariant Transformer-based ICL model trained with a CP-aware loss. By simulating the multiple retrained models required by FCP without actual retraining, E-ICL+FCP preserves coverage while markedly reducing both inefficiency and computational overhead. Experiments on synthetic and real tasks demonstrate that E-ICL+FCP attains superior efficiency-coverage trade-offs compared to existing SCP and FCP baselines.","PeriodicalId":13154,"journal":{"name":"IEEE Signal Processing Letters","volume":"33 ","pages":"311-315"},"PeriodicalIF":3.9,"publicationDate":"2025-11-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145886641","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Polycystic ovary syndrome (PCOS) not only causes anovulation in women but also severely affects their physical and mental health. Clinically, diagnostic delays often cause patients to miss optimal treatment windows. As a non-invasive detection technique, Raman spectroscopy has been used for screening this disease. In this letter, the Raman spectra of follicular fluid and plasma from women with PCOS are examined using a deep physics-informed neural network. The results demonstrate that, by incorporating physical priors and integrating multi-domain spectral information, the proposed method achieves accuracies of 96.25% in detecting PCOS from plasma samples and 90.00% from follicular fluid samples.
{"title":"Study on an Intelligent Screening Method for Polycystic Ovary Syndrome Based on Deep PhysicsInformed Neural Network","authors":"Yu Gong;Danji Wang;Chao Wu;Man Ni;Shengli Li;Yang Liu;Ziyuan Shen;Zhidong Su;Xiaoxiao Liu;Huiping Zhou;Huijie Zhang","doi":"10.1109/LSP.2025.3636719","DOIUrl":"https://doi.org/10.1109/LSP.2025.3636719","url":null,"abstract":"Polycystic ovary syndrome (PCOS) not only causes anovulation in women but also severely affects their physical and mental health. Clinically, diagnostic delays often cause patients to miss optimal treatment windows. As a non-invasive detection technique, Raman spectroscopy has been used for screening this disease. In this letter, the Raman spectra of follicular fluid and plasma from women which PCOS are examined using a deep physics-informed neural network. The results demonstrate that by incorporating physical priors and integrating multi-domain spectral information, the proposed method achieves accuracies of 96.25<inline-formula><tex-math>$%$</tex-math></inline-formula> in detecting PCOS from plasma samples and 90.00<inline-formula><tex-math>$%$</tex-math></inline-formula> from follicular fluid samples.","PeriodicalId":13154,"journal":{"name":"IEEE Signal Processing Letters","volume":"33 ","pages":"266-270"},"PeriodicalIF":3.9,"publicationDate":"2025-11-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145830869","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-11-21 | DOI: 10.1109/LSP.2025.3635004
Hao Wu;Fengjiao Gan;Xu Chen
This letter presents a wide field-of-view (FoV) millimeter-wave array synthetic aperture radar (SAR) imaging system based on a curved linear array. The proposed system retains the low-cost advantage of planar scanning array SARs while offering a broader viewing angle. However, the significant disparity in spatial sampling density across different regions of the sampling aperture results in suboptimal imaging performance when the classical back-projection algorithm (BPA) is employed. To address this issue, we introduce a measurement-fusion imaging algorithm tailored to this system, which constructs uniformly sampled sub-apertures and calculates spatial grid weights. This approach significantly enhances image integrity and mitigates artifacts and sidelobes. Experiments demonstrate high-quality imaging with an extended FoV.
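To make the role of the spatial grid weights concrete, the sketch below is a bare-bones back-projection that accumulates interpolated, phase-compensated echoes per pixel with an optional per-antenna, per-pixel weight; setting the weights to one recovers the classical BPA. The signal model (range-compressed complex echoes) and argument names are assumptions, and the letter's sub-aperture construction is not reproduced here.

```python
import numpy as np

def weighted_backprojection(echoes, ant_pos, t_axis, grid, fc, c=3e8, weights=None):
    """
    echoes:  (A, T) range-compressed complex echo per antenna position
    ant_pos: (A, 3) antenna positions along the curved linear scan
    t_axis:  (T,)   fast-time axis in seconds; grid: (P, 3) image grid points
    weights: (A, P) optional spatial grid weights for non-uniform sampling density
    """
    if weights is None:
        weights = np.ones((len(ant_pos), len(grid)))
    image = np.zeros(len(grid), dtype=complex)
    for a, pos in enumerate(ant_pos):
        r = np.linalg.norm(grid - pos, axis=1)            # range from antenna to each pixel
        tau = 2.0 * r / c                                  # round-trip delay
        echo = np.interp(tau, t_axis, echoes[a].real) + \
               1j * np.interp(tau, t_axis, echoes[a].imag)
        image += weights[a] * echo * np.exp(1j * 2 * np.pi * fc * tau)   # carrier phase compensation
    return image
```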
{"title":"Wide Field-of-View MMW SISO-SAR Image Reconstruction Based on Curved Linear Array","authors":"Hao Wu;Fengjiao Gan;Xu Chen","doi":"10.1109/LSP.2025.3635004","DOIUrl":"https://doi.org/10.1109/LSP.2025.3635004","url":null,"abstract":"This letter presents a wide field-of-view (FoV) millimeter-wave array synthetic aperture radar (SAR) imaging system based on curved linear array. The proposed system retains the low-cost advantage of planar scanning array SARs while offering a broader viewing angle. However, the significant disparity in spatial sampling density across different regions of the sampling aperture results in suboptimal imaging performance when employing the classical back-projection algorithm (BPA). To address this issue, we introduce a measurement-fusion imaging algorithm tailored for this system, which involves constructing uniformly sampled sub-apertures and calculating spatial grid weights. This approach significantly enhances image integrity and mitigates artifacts and sidelobes. Experiments demonstrate high-quality imaging with an extended FoV.","PeriodicalId":13154,"journal":{"name":"IEEE Signal Processing Letters","volume":"32 ","pages":"4464-4468"},"PeriodicalIF":3.9,"publicationDate":"2025-11-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145612121","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}