
Signal Processing-Image Communication: Latest Publications

UW-SDE: Multi-scale prompt feature guided diffusion model for underwater image enhancement
IF 2.7 | CAS Zone 3 (Engineering & Technology) | Q2 ENGINEERING, ELECTRICAL & ELECTRONIC | Pub Date: 2026-01-14 | DOI: 10.1016/j.image.2026.117486
Jiaxi Li, Junjun Wu, Qinghua Lu, Ningwei Qin, Shuhong Zhou, Weijian Li
In recent years, diffusion models have achieved remarkable performance in image generation and have been widely applied, and their potential in image enhancement tasks is gradually being explored. However, when applied to underwater scenes, diffusion models designed for general image restoration struggle to achieve their expected performance. This is because the scattering and absorption of light in underwater environments leave underwater images suffering from color distortion, low contrast, and haziness. These issues often co-occur within a single underwater image, making underwater image enhancement more challenging than typical image enhancement tasks. To better adapt diffusion models to underwater image enhancement, this paper proposes an underwater image enhancement method based on a latent diffusion model. The proposed model’s latent encoder progressively mitigates adverse degradation factors embedded within the hidden layers while preserving essential image feature information in the latent representation, thus enabling a smoother diffusion process. Additionally, we design a gated fusion network that integrates guiding features at multiple scales, steering the network towards diffusion with superior visual quality restoration. A series of qualitative and quantitative experiments on various real-world underwater image datasets demonstrates that the proposed method outperforms recent state-of-the-art methods in visual effects and generalization capability, proving the effectiveness of applying diffusion models to underwater enhancement tasks.
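The gated fusion of multi-scale guiding features lends itself to a compact sketch. Below is a minimal PyTorch illustration of gating a prompt feature into a diffusion feature map; the module layout, channel widths, and gating rule are assumptions for illustration, not the authors' implementation.

```python
# Minimal sketch of gated multi-scale feature fusion (assumed design, PyTorch).
import torch
import torch.nn as nn
import torch.nn.functional as F

class GatedFusion(nn.Module):
    """Fuse a guiding (prompt) feature into a diffusion feature via a learned gate."""
    def __init__(self, channels):
        super().__init__()
        self.gate = nn.Sequential(
            nn.Conv2d(2 * channels, channels, kernel_size=1),
            nn.Sigmoid(),
        )
        self.proj = nn.Conv2d(channels, channels, kernel_size=3, padding=1)

    def forward(self, diff_feat, guide_feat):
        # Resize the guiding feature to the diffusion feature's spatial scale.
        guide = F.interpolate(guide_feat, size=diff_feat.shape[-2:],
                              mode="bilinear", align_corners=False)
        g = self.gate(torch.cat([diff_feat, guide], dim=1))  # per-pixel gate in [0, 1]
        return diff_feat + g * self.proj(guide)              # gated residual injection

x = torch.randn(1, 64, 32, 32)      # diffusion feature
p = torch.randn(1, 64, 64, 64)      # prompt feature at another scale
print(GatedFusion(64)(x, p).shape)  # torch.Size([1, 64, 32, 32])
```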
Citations: 0
A new baseline for edge detection: Make encoder–decoder great again
IF 2.7 | CAS Zone 3 (Engineering & Technology) | Q2 ENGINEERING, ELECTRICAL & ELECTRONIC | Pub Date: 2026-01-14 | DOI: 10.1016/j.image.2026.117485
Yachuan Li, Xavier Soria Poma, Yongke Xi, Guanlin Li, Chaozhi Yang, Qian Xiao, Yun Bai, Zongmin Li
The performance of deep learning-based edge detectors has surpassed that of humans, but huge computational costs and complex training strategies hinder their further development and application. In this paper, we alleviate these complexities with a vanilla encoder–decoder based detector. First, we design a bilateral encoder to decouple the extraction of spatial features and semantic features. Because the spatial branch no longer guides the semantic branch, feature richness can be reduced, enabling a more compact model design. We propose a cascaded feature fusion decoder in which the spatial features are progressively refined by semantic features. The refined spatial features are the only basis for generating the edge map; the coarse original spatial features and the semantic features are kept from direct contact with the final result, so noise in the spatial features and location errors in the semantic features are suppressed in the generated edge map. The proposed New Baseline for Edge Detection (NBED) achieves superior performance consistently across multiple edge detection benchmarks, even compared with methods that incur huge computational costs and complex training strategies. The ODS of NBED on BSDS500 is 0.838, achieving state-of-the-art performance. Our study highlights that high-quality features are key to modern edge detection, and that encoder–decoder based detectors can achieve excellent performance without complex training or heavy computation. Furthermore, we take retinal vessel segmentation as an example to explore the application of NBED to downstream tasks. The code is available at https://github.com/Li-yachuan/NBED.
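To make the cascaded refinement concrete, here is a minimal PyTorch sketch in which semantic features progressively refine a spatial feature and only the refined spatial feature reaches the edge head; channel widths and layer choices are assumptions, not the released NBED code.

```python
# Minimal sketch of a cascaded feature fusion decoder (assumed layout, PyTorch).
import torch
import torch.nn as nn
import torch.nn.functional as F

class CascadedFusionDecoder(nn.Module):
    """Progressively refine a spatial feature with semantic features, deepest first."""
    def __init__(self, spatial_ch, semantic_chs):
        super().__init__()
        self.refiners = nn.ModuleList(
            nn.Conv2d(spatial_ch + c, spatial_ch, kernel_size=3, padding=1)
            for c in semantic_chs
        )
        self.head = nn.Conv2d(spatial_ch, 1, kernel_size=1)  # edge-map head

    def forward(self, spatial, semantics):
        x = spatial
        for refine, sem in zip(self.refiners, semantics):
            sem = F.interpolate(sem, size=x.shape[-2:], mode="bilinear",
                                align_corners=False)
            x = x + refine(torch.cat([x, sem], dim=1))  # residual refinement step
        # Only the refined spatial features generate the edge map.
        return torch.sigmoid(self.head(x))

dec = CascadedFusionDecoder(32, [256, 128, 64])
edge = dec(torch.randn(1, 32, 160, 160),
           [torch.randn(1, 256, 20, 20),
            torch.randn(1, 128, 40, 40),
            torch.randn(1, 64, 80, 80)])
print(edge.shape)  # torch.Size([1, 1, 160, 160])
```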
Citations: 0
Secure vision: Integrated anti-spoofing and deep-fake detection system using knowledge distillation approach
IF 2.7 | CAS Zone 3 (Engineering & Technology) | Q2 ENGINEERING, ELECTRICAL & ELECTRONIC | Pub Date: 2026-01-09 | DOI: 10.1016/j.image.2026.117481
K Jayashree, S Chakaravarthi, J Samyuktha, J Savitha, M Chaarulatha, A Yogeswari, G Samyuktha
Malicious users generate fake videos and images that spread misinformation and are used to harass and blackmail vulnerable people. A wide variety of techniques, including combining, merging, replacing, and imposing photos and video recordings, are used to construct deepfakes. Moreover, spoofed audio and calls are generated through deepfakes, and detecting them requires specially trained models. Machine learning and deep learning are improving rapidly, and a variety of techniques and tools are employed for deepfake detection and anti-spoofing. Detecting both spoofing and deepfakes becomes feasible once existing issues such as limited generalizability, overfitting, and complexity are resolved. To overcome these challenges, a knowledge distillation model is introduced in this paper. The process begins with pre-processing using a weighted median filter (WmF), where averaging the intensity of neighboring pixels helps to smooth out variations. After that, feature extraction is carried out by a Dual attention based dilated ResNeXT with Residual autoencoder (DAD-DRAE), which provides features of lower dimensionality. In the classification phase, an Optimized Multi-task Transformer induced Relational knowledge distillation model (OMT-RKD) is deployed to categorize the distinct anti-spoofing and deepfake classes. The hyperparameters of the classification model are tuned by the Tent chaotic Hippo optimization algorithm (TCHOA); the chaotic function improves convergence, which decreases model parameter complexity. In the evaluation, the proposed model is trained on three datasets and achieves an accuracy of 98.68%, 98.22% and 98.44% on the Deepfake Detection Challenge (DFDC) dataset, the ASVspoof dataset, and FaceForensics++, respectively.
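The weighted median pre-processing step can be sketched directly. Below is a minimal NumPy illustration in which integer weights act as repetition counts before taking the neighborhood median; the 3x3 center-weighted mask is an assumed example, not necessarily the paper's setting.

```python
# Minimal weighted median filter sketch (assumed weight mask, NumPy).
import numpy as np

def weighted_median_filter(img, weights):
    """Each pixel becomes the weighted median of its neighborhood: neighbors
    are repeated according to integer weights before taking the median."""
    k = weights.shape[0]
    pad = k // 2
    padded = np.pad(img, pad, mode="edge")
    out = np.empty_like(img)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            patch = padded[i:i + k, j:j + k].ravel()
            rep = np.repeat(patch, weights.ravel())  # weight = repetition count
            out[i, j] = np.median(rep)
    return out

w = np.array([[1, 2, 1],
              [2, 4, 2],
              [1, 2, 1]])                            # center-weighted mask (assumed)
img = (np.random.rand(32, 32) * 255).astype(np.uint8)
print(weighted_median_filter(img, w).shape)          # (32, 32)
```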
Citations: 0
Robust coverless image steganography based on ring features and DWT sequence mapping
IF 2.7 | CAS Zone 3 (Engineering & Technology) | Q2 ENGINEERING, ELECTRICAL & ELECTRONIC | Pub Date: 2026-01-09 | DOI: 10.1016/j.image.2026.117482
Chen-Yi Lin, Su-Ho Chiu
The widespread adoption of the Internet has enhanced communication between individuals but has increased the risk of secret messages being intercepted, drawing public attention to the security of message transmission. Image steganography has been a prominent area of research within the field of secure communication technologies. However, traditional image steganography techniques risk being compromised by steganalysis tools, leading researchers to propose the concept of coverless image steganography. In recent years, numerous coverless image steganography techniques have been developed that effectively resist steganalysis tools. However, these techniques commonly suffer from incomplete mapping of secret messages, rendering them incapable of successfully concealing the information. Furthermore, most existing coverless steganography techniques rely on cryptographic methods to protect auxiliary information, which may raise suspicion and result in interception, preventing the receiver from correctly recovering the secret messages. To address these issues, this study proposes a novel coverless image steganography technique based on ring features and discrete wavelet transform (DWT) sequence mapping. The method generates feature sequences from both the spatial and frequency domains of images and employs an innovative stego-image collage mechanism to transmit auxiliary information, thereby reducing the risk of interception. Experimental results demonstrate that the proposed technique significantly enhances the richness of feature sequences and the completeness of message mapping, achieving a 100% success rate on medium- and large-scale image datasets. Moreover, the proposed method exhibits superior robustness even under conditions where existing techniques suffer from low mapping success rates or prolonged mapping times.
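To illustrate the sequence-mapping idea, here is a minimal sketch of deriving a binary feature sequence from an image's DWT low-frequency sub-band, assuming PyWavelets; the thresholding rule (comparison against the sub-band mean) is an illustrative hash, not the paper's exact mapping. In coverless steganography, a sender indexes a database of natural images by such sequences and transmits the image whose sequence equals the secret bits, so the cover image is never modified.

```python
# Minimal sketch of a DWT-domain feature sequence (assumed hashing rule).
import numpy as np
import pywt

def dwt_feature_sequence(img, bits=16):
    ll, _ = pywt.dwt2(img.astype(float), "haar")  # low-frequency sub-band
    coeffs = ll.ravel()[:bits]
    return (coeffs > coeffs.mean()).astype(int)   # 1 if above the sub-band mean

img = np.random.rand(64, 64)
seq = dwt_feature_sequence(img)
print("".join(map(str, seq)))  # e.g. 1010011100101101
```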
Citations: 0
Learned lossless medical image compression via dual transform and subimage-wise auto-regression
IF 2.7 | CAS Zone 3 (Engineering & Technology) | Q2 ENGINEERING, ELECTRICAL & ELECTRONIC | Pub Date: 2026-01-08 | DOI: 10.1016/j.image.2025.117455
Tiantian Li, Yue Li, Ruixiao Guo, Gaobo Yang
While learning-based compression has achieved significant success for natural images, its application to medical imaging requires specialized approaches that ensure zero information loss. In this work, we propose a novel lossless compression framework for 2D medical images, concentrating on two objectives: entropy reduction and precise probability estimation. We design a dual-transform module, namely a DCT and a reversible linear map decomposition, which transforms image pixels into low-entropy representations: structural and textural components in the integer DCT domain. A novel row-column grouping strategy is designed to decompose the textural components into subimages. Unlike existing parametric models, DCT-LSBNet models the non-parametric probability estimate for each subimage in an auto-regressive way, avoiding biased assumptions about the probability distribution and balancing well between compression rate and encoding latency. Extensive experimental results show that, compared with existing traditional and learned lossless compression, our method achieves state-of-the-art performance on X-ray and ultrasound images.
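The row-column grouping can be sketched in a few lines. The NumPy illustration below splits an image into four parity subimages, which an auto-regressive coder could then model one after another; the paper's exact grouping may differ, so treat this as the decomposition idea only.

```python
# Minimal sketch of row-column grouping into parity subimages (NumPy).
import numpy as np

def row_column_group(x):
    """Return the four parity subimages (even/even, even/odd, odd/even, odd/odd)."""
    return [x[0::2, 0::2], x[0::2, 1::2], x[1::2, 0::2], x[1::2, 1::2]]

x = np.arange(64).reshape(8, 8)
subs = row_column_group(x)
print([s.shape for s in subs])  # [(4, 4), (4, 4), (4, 4), (4, 4)]
# In subimage-wise auto-regression, earlier subimages condition the
# probability model used to entropy-code the later ones.
```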
Citations: 0
Micro-expression recognition based on dataset balance and local connected bi-branch network
IF 2.7 | CAS Zone 3 (Engineering & Technology) | Q2 ENGINEERING, ELECTRICAL & ELECTRONIC | Pub Date: 2026-01-06 | DOI: 10.1016/j.image.2026.117480
Hanpu Wang, Fuyuan Luo, Ju Zhou, Xinyu Liu, Haolin Xia, Tong Chen
Micro-expressions are subtle and transient facial movements that reveal underlying human emotions, and they hold significant research and application value in fields such as public safety, criminal investigation, and clinical diagnosis. However, because micro-expressions are fleeting and of low intensity, existing micro-expression datasets are limited in size and suffer from severe class imbalance, which poses great challenges for reliable recognition. In this paper, we propose an assessment-based re-sampling (ASR) strategy to augment micro-expression data and alleviate category imbalance. Specifically, we first employ semi-supervised self-training on the original dataset to learn an assessment model with both high accuracy and high recall. This model is then used to evaluate frames in micro-expression video sequences (excluding those in the training set). The non-apex frames identified through this assessment are subsequently selected to directly expand the underrepresented classes. Furthermore, we design a locally connected bi-branch network (LCB) for micro-expression recognition. In this network, the high-frequency components of micro-expression frames are extracted to capture weak facial muscle movements and combined with global information as complementary input. We conduct extensive experiments on three benchmark datasets: CASME, CASME II, and SAMM. The results demonstrate that our method is both effective and competitive, achieving an accuracy of 90.23% on the SAMM dataset.
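To illustrate the high-frequency branch input, here is a minimal OpenCV sketch that isolates fine facial detail in a grayscale frame by subtracting its Gaussian-blurred (low-frequency) version; the kernel size is an assumed value, not the paper's setting.

```python
# Minimal sketch of high-frequency component extraction (assumed kernel, OpenCV).
import cv2
import numpy as np

def high_frequency(frame, ksize=11):
    low = cv2.GaussianBlur(frame, (ksize, ksize), 0)  # smooth global structure
    return cv2.subtract(frame, low)                   # residual = fine detail

frame = (np.random.rand(128, 128) * 255).astype(np.uint8)
hf = high_frequency(frame)
# Feed `frame` (global branch) and `hf` (high-frequency branch) as the
# complementary inputs of a bi-branch recognition network.
print(hf.shape)  # (128, 128)
```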
Citations: 0
UIQA-MSST: Multi-Scale Staircase-Transformer Fusion for Underwater Image Quality Assessment
IF 2.7 | CAS Zone 3 (Engineering & Technology) | Q2 ENGINEERING, ELECTRICAL & ELECTRONIC | Pub Date: 2026-01-06 | DOI: 10.1016/j.image.2026.117479
Tianhai Chen, Xichen Yang, Tianshu Wang, Shun Zhu, Yan Zhang, Zhongyuan Mao, Nengxin Li
Underwater images play a crucial role in underwater exploration and resource development, but their quality often degrades in complex underwater scenarios. Existing methods mainly focus on specific scenarios and exhibit limited generalization when addressing complex underwater scenes, so enhancing their applicability is essential for accurate quality assessment of underwater images across diverse scenarios. This paper proposes an Underwater Image Quality Assessment (UIQA) method that combines the advantages of a staircase network and a Transformer, focusing on efficiently capturing and integrating image features at different scales. Initially, multi-scale feature extraction is performed to obtain information from images at various levels. Following this, a Staircase Feature (SF) module progressively integrates features from shallow to deep layers, achieving fusion of cross-scale information. Additionally, a Cross-Scale Transformer (CST) module effectively merges information from multiple scales using self-attention mechanisms. By concatenating the output features of both modules, the model gains an understanding of image content across global and local ranges. Subsequently, a regression module is utilized to generate quality scores. Finally, meta-learning optimizes the model’s learning process, enabling adaptation to new data for accurate image quality prediction across diverse scenarios. Experiments show superior accuracy and stability on underwater datasets, with additional tests on natural scenes demonstrating broader applicability. Cross-dataset experiments validate the generalization capability of the proposed method. The source code will be made available at https://github.com/dart-into/UIQA-MSST.
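The staircase integration from shallow to deep layers can be sketched as follows in PyTorch; the channel widths and the pooling/1x1-convolution step are illustrative assumptions rather than the paper's configuration.

```python
# Minimal sketch of staircase-style shallow-to-deep fusion (assumed design).
import torch
import torch.nn as nn
import torch.nn.functional as F

class StaircaseFusion(nn.Module):
    def __init__(self, chs):  # chs ordered shallow -> deep, e.g. [64, 128, 256]
        super().__init__()
        self.steps = nn.ModuleList(
            nn.Conv2d(chs[i] + chs[i + 1], chs[i + 1], kernel_size=1)
            for i in range(len(chs) - 1)
        )

    def forward(self, feats):  # feats ordered shallow -> deep
        x = feats[0]
        for step, deeper in zip(self.steps, feats[1:]):
            x = F.adaptive_avg_pool2d(x, deeper.shape[-2:])  # match spatial size
            x = step(torch.cat([x, deeper], dim=1))          # one staircase step
        return x  # fused cross-scale representation fed to the regression head

f = [torch.randn(1, 64, 56, 56),
     torch.randn(1, 128, 28, 28),
     torch.randn(1, 256, 14, 14)]
print(StaircaseFusion([64, 128, 256])(f).shape)  # torch.Size([1, 256, 14, 14])
```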
Citations: 0
Text-based person search via fine-grained cross-modal semantic alignment
IF 2.7 | CAS Zone 3 (Engineering & Technology) | Q2 ENGINEERING, ELECTRICAL & ELECTRONIC | Pub Date: 2026-01-05 | DOI: 10.1016/j.image.2026.117478
Feng Chen, Jielong He, Yang Liu, Xiwen Qu
Existing text-based person search methods face challenges in handling complex cross-modal interactions, often failing to capture subtle semantic nuances. To address this, we propose a novel Fine-grained Cross-modal Semantic Alignment (FCSA) framework that enhances accuracy and robustness in text-based person search. FCSA introduces two key components: the Cross-Modal Reconstruction Strategy (CMRS) and the Saliency-Guided Masking Mechanism (SGMM). CMRS facilitates feature alignment by leveraging incomplete visual and textual features, promoting bidirectional reasoning across modalities, and enhancing fine-grained semantic understanding. SGMM further refines performance by dynamically focusing on salient visual patches and critical text tokens, thereby improving discriminative region perception and image–text matching precision. Our approach outperforms existing state-of-the-art methods, achieving mean Average Precision (mAP) scores of 69.72%, 43.78% and 48.78% on CUHK-PEDES, ICFG-PEDES, and RSTPReid, respectively. Source code is at https://github.com/flychen321/FCSA.
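The saliency-guided masking idea admits a short sketch. In the PyTorch illustration below, the most "salient" patch tokens are zeroed out so the model must recover them from cross-modal context; using the token norm as a saliency proxy is an assumption standing in for the paper's saliency estimate.

```python
# Minimal sketch of saliency-guided token masking (assumed saliency proxy).
import torch

def mask_salient_tokens(tokens, mask_ratio=0.3):
    # tokens: (batch, num_patches, dim) ViT-style patch embeddings
    saliency = tokens.norm(dim=-1)          # (B, N) pseudo-saliency per patch
    k = int(tokens.shape[1] * mask_ratio)
    idx = saliency.topk(k, dim=1).indices   # indices of the top-k salient patches
    masked = tokens.clone()
    # Zero out the selected tokens along the patch dimension.
    masked.scatter_(1, idx.unsqueeze(-1).expand(-1, -1, tokens.shape[-1]), 0.0)
    return masked

t = torch.randn(2, 196, 768)                # 14x14 patches, 768-dim tokens
print(mask_salient_tokens(t).shape)         # torch.Size([2, 196, 768])
```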
Citations: 0
GT-MilliNoise: Graph transformer for point-wise denoising of indoor millimetre-wave point clouds
IF 2.7 | CAS Zone 3 (Engineering & Technology) | Q2 ENGINEERING, ELECTRICAL & ELECTRONIC | Pub Date: 2025-12-31 | DOI: 10.1016/j.image.2025.117453
Walter Brescia, Pedro Gomes, Laura Toni, Saverio Mascolo, Luca De Cicco
Millimetre-wave (mmWave) radars are gaining popularity thanks to their low cost and robustness in low-visibility conditions. However, the 3D point clouds they produce are sparser and noisier than those from LiDARs and depth cameras. These differences create challenges when applying existing methods, originally designed for dense point clouds, to mmWave data. Specifically, there is a gap in point-level precision tasks, such as full point cloud denoising for mmWave data, partly due to the lack of fully annotated datasets. In this work, we employ the MilliNoise dataset, a fully annotated indoor mmWave point cloud dataset, to advance the understanding of mmWave point cloud denoising via two main steps: (i) we carry out an experimental analysis of the most common point cloud processing approaches and show their limitations in exploring the local-to-global structures of sparse and noisy point clouds; (ii) in light of the identified limitations, we propose a graph-based transformer architecture, denoted GT-MilliNoise, composed of two main blocks that effectively leverage both the temporal and geometric structure of the data: a Temporal block leverages the sparsity of the data to learn the dynamic behaviour of the points, and a Geometric block uses a point-wise attention mechanism to form representative neighbourhoods for feature extraction. The experimental results obtained on the MilliNoise dataset show that our proposed GT-MilliNoise architecture outperforms the state of the art both qualitatively and quantitatively. Specifically, it achieves 75% accuracy (a 5% gain over the state of the art) and a notably low Earth Mover’s distance of 0.193.
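The point-wise attention in the Geometric block can be sketched as attention over k-nearest-neighbour neighbourhoods. The PyTorch illustration below is a rough sketch under assumed dimensions and a plain distance-based neighbourhood rule, not the released GT-MilliNoise code.

```python
# Minimal sketch of point-wise attention over k-NN neighbourhoods (assumed design).
import torch
import torch.nn as nn

class PointNeighbourAttention(nn.Module):
    def __init__(self, dim, k=8):
        super().__init__()
        self.k = k
        self.q = nn.Linear(dim, dim)
        self.kv = nn.Linear(dim, 2 * dim)

    def forward(self, xyz, feat):
        # xyz: (N, 3) point positions, feat: (N, D) per-point features
        d = torch.cdist(xyz, xyz)                       # (N, N) pairwise distances
        idx = d.topk(self.k, largest=False).indices     # k nearest neighbours
        neigh = feat[idx]                               # (N, k, D) neighbourhood
        k_, v = self.kv(neigh).chunk(2, dim=-1)         # keys/values per neighbour
        q = self.q(feat).unsqueeze(1)                   # (N, 1, D) query per point
        attn = ((q * k_).sum(-1) / feat.shape[-1] ** 0.5).softmax(dim=-1)  # (N, k)
        return (attn.unsqueeze(-1) * v).sum(1)          # aggregated (N, D) features

pts, f = torch.randn(100, 3), torch.randn(100, 64)
print(PointNeighbourAttention(64)(pts, f).shape)  # torch.Size([100, 64])
```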
Citations: 0
MAPLE: Combination of multiple angles of view and enhanced pseudo-label generation for unsupervised person re-identification
IF 2.7 | CAS Zone 3 (Engineering & Technology) | Q2 ENGINEERING, ELECTRICAL & ELECTRONIC | Pub Date: 2025-12-30 | DOI: 10.1016/j.image.2025.117462
Mai T. Do, Anh D. Nguyen
The primary objective of person ReID is to identify a specific individual across various surveillance cameras. Although early studies focused on supervised ReID using deep learning models, application to real-world scenarios has highlighted challenges such as large data volumes and increased manual labeling costs. This has led to a surge of interest in unsupervised ReID techniques, which utilize unlabeled data. Unsupervised person ReID methods can be classified into Unsupervised Domain Adaptation (UDA) ReID approaches and fully Unsupervised Learning (USL) ReID approaches. While UDA ReID leverages knowledge transfer from a source to a target domain, it can suffer from limitations due to domain discrepancies. In contrast, USL ReID relies solely on unlabeled datasets, offering flexibility and scalability but grappling with challenges related to feature representation and pseudo-labeling accuracy. This research introduces MAPLE to address these challenges. Our contributions include a novel strategy to integrate local region information with global features, called Multi-Angles of View; an improved approach to unsupervised clustering using the DBSCAN method; and the integration of domain adaptation to bolster unsupervised learning. Extensive experiments on benchmarks such as Market-1501 and MSMT17 demonstrate our method’s superior performance compared with state-of-the-art approaches, confirming its practical potential.
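The DBSCAN pseudo-labelling step at the core of such USL pipelines can be sketched with scikit-learn. In the illustration below, the synthetic features and the eps/min_samples values are assumptions; MAPLE's enhanced generation builds refinements on top of this baseline step.

```python
# Minimal sketch of DBSCAN-based pseudo-label generation (assumed parameters).
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.preprocessing import normalize

rng = np.random.default_rng(0)
centers = rng.normal(size=(10, 128))                 # 10 synthetic identities
feats = np.concatenate([c + 0.05 * rng.normal(size=(50, 128)) for c in centers])
feats = normalize(feats)                             # L2-normalize ReID features

labels = DBSCAN(eps=0.3, min_samples=4).fit_predict(feats)
# Label -1 marks outliers, which are excluded from the next training round;
# the remaining cluster ids serve as pseudo identities for ID training.
print("pseudo identities:", labels.max() + 1, "outliers:", (labels == -1).sum())
```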
Citations: 0