No-reference video quality assessment based on human visual perception
Pub Date: 2024-07-01 | DOI: 10.1117/1.jei.33.4.043029
Zhou Zhou, Guangqian Kong, Xun Duan, Huiyun Long
Conducting video quality assessment (VQA) for user-generated content (UGC) videos and achieving consistency with subjective quality assessment are highly challenging tasks. We propose a no-reference video quality assessment (NR-VQA) method for UGC scenarios by considering characteristics of human visual perception. To distinguish between varying levels of human attention within different regions of a single frame, we devise a dual-branch network. This network extracts spatial features containing positional information of moving objects from frame-level images. In addition, we employ the temporal pyramid pooling module to effectively integrate temporal features of different scales, enabling the extraction of inter-frame temporal information. To mitigate the time-lag effect in the human visual system, we introduce the temporal pyramid attention module. This module evaluates the significance of individual video frames and simulates the varying attention levels exhibited by humans towards frames. We conducted experiments on the KoNViD-1k, LIVE-VQC, CVD2014, and YouTube-UGC databases. The experimental results demonstrate the superior performance of our proposed method compared to recent NR-VQA techniques in terms of both objective assessment and consistency with subjective assessment.
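To make the temporal pyramid pooling idea concrete, here is a minimal sketch of pooling frame-level features at several temporal scales and concatenating the results; the pyramid levels, the average-pooling operator, and the feature dimensions are illustrative assumptions rather than the authors' exact design.

```python
import torch
import torch.nn.functional as F

def temporal_pyramid_pool(features: torch.Tensor, levels=(1, 2, 4)) -> torch.Tensor:
    """Pool frame-level features at several temporal scales and concatenate.

    features: tensor of shape (T, C) -- one feature vector per frame.
    levels:   number of temporal bins at each pyramid level (assumed values).
    Returns a fixed-length vector of shape (C * sum(levels),).
    """
    t, c = features.shape
    x = features.t().unsqueeze(0)                 # (1, C, T) for 1-D adaptive pooling
    pooled = []
    for n_bins in levels:
        # Average-pool the temporal axis into n_bins segments.
        p = F.adaptive_avg_pool1d(x, n_bins)      # (1, C, n_bins)
        pooled.append(p.flatten(1))               # (1, C * n_bins)
    return torch.cat(pooled, dim=1).squeeze(0)

# Example: 32 frames, 256-dim features per frame.
feats = torch.randn(32, 256)
video_descriptor = temporal_pyramid_pool(feats)   # shape: (256 * 7,)
print(video_descriptor.shape)
```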
{"title":"No-reference video quality assessment based on human visual perception","authors":"Zhou Zhou, Guangqian Kong, Xun Duan, Huiyun Long","doi":"10.1117/1.jei.33.4.043029","DOIUrl":"https://doi.org/10.1117/1.jei.33.4.043029","url":null,"abstract":"Conducting video quality assessment (VQA) for user-generated content (UGC) videos and achieving consistency with subjective quality assessment are highly challenging tasks. We propose a no-reference video quality assessment (NR-VQA) method for UGC scenarios by considering characteristics of human visual perception. To distinguish between varying levels of human attention within different regions of a single frame, we devise a dual-branch network. This network extracts spatial features containing positional information of moving objects from frame-level images. In addition, we employ the temporal pyramid pooling module to effectively integrate temporal features of different scales, enabling the extraction of inter-frame temporal information. To mitigate the time-lag effect in the human visual system, we introduce the temporal pyramid attention module. This module evaluates the significance of individual video frames and simulates the varying attention levels exhibited by humans towards frames. We conducted experiments on the KoNViD-1k, LIVE-VQC, CVD2014, and YouTube-UGC databases. The experimental results demonstrate the superior performance of our proposed method compared to recent NR-VQA techniques in terms of both objective assessment and consistency with subjective assessment.","PeriodicalId":54843,"journal":{"name":"Journal of Electronic Imaging","volume":"16 1","pages":""},"PeriodicalIF":1.1,"publicationDate":"2024-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141771009","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Background-focused contrastive learning for unpaired image-to-image translation
Pub Date: 2024-07-01 | DOI: 10.1117/1.jei.33.4.043023
Mingwen Shao, Minggui Han, Lingzhuang Meng, Fukang Liu
Contrastive learning for unpaired image-to-image translation (CUT) aims to learn a mapping from a source to a target domain with an unpaired dataset, combining a contrastive loss to maximize the mutual information between real and generated images. However, existing CUT-based methods exhibit unsatisfactory visual quality due to incorrect localization of objects and backgrounds; in particular, on layout-changing datasets they wrongly transform the background to match the object pattern. To alleviate this issue, we present background-focused contrastive learning for unpaired image-to-image translation (BFCUT) to improve background consistency between real images and their generated counterparts. Specifically, we first generate heat maps to explicitly locate objects and backgrounds for the subsequent contrastive loss and global background similarity loss. Then, representative queries of objects and backgrounds, rather than randomly sampled queries, are selected for the contrastive loss to promote the realism of objects and the preservation of backgrounds. Meanwhile, global semantic vectors with less object information are extracted with the help of the heat maps, and we further align the vectors of real images with those of their corresponding generated images to encourage background preservation through the global background similarity loss. BFCUT alleviates the erroneous translation of backgrounds and generates more realistic images. Extensive experiments on three datasets demonstrate better quantitative results and qualitative visual effects.
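As a rough illustration of selecting queries from a heat map rather than sampling them at random, the sketch below computes a PatchNCE-style contrastive loss over the most background-like patch locations; the temperature, the number of selected patches, and the per-location positive/negative scheme are assumptions, not BFCUT's actual losses.

```python
import torch
import torch.nn.functional as F

def infonce(query, positive, negatives, tau=0.07):
    """InfoNCE for a single query against one positive and a bank of negatives."""
    q = F.normalize(query, dim=-1)            # (1, D)
    pos = F.normalize(positive, dim=-1)       # (1, D)
    neg = F.normalize(negatives, dim=-1)      # (N, D)
    logits = torch.cat([(q * pos).sum(-1, keepdim=True), q @ neg.t()], dim=-1) / tau
    return F.cross_entropy(logits, torch.zeros(1, dtype=torch.long))

def background_focused_nce(feat_gen, feat_real, heat, k=8):
    """Use the k most background-like patch locations (lowest heat) as queries,
    pulling each generated patch toward the real patch at the same location and
    pushing it away from the other real patches."""
    bg_idx = (-heat).topk(k).indices          # low heat = background-dominant
    loss = 0.0
    for i in bg_idx.tolist():
        others = torch.cat([feat_real[:i], feat_real[i + 1:]], dim=0)
        loss = loss + infonce(feat_gen[i:i + 1], feat_real[i:i + 1], others)
    return loss / k

# Example: 64 patch features of dimension 256 and a flattened object heat map.
feat_gen, feat_real = torch.randn(64, 256), torch.randn(64, 256)
heat = torch.rand(64)
print(background_focused_nce(feat_gen, feat_real, heat).item())
```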
{"title":"Background-focused contrastive learning for unpaired image-to-image translation","authors":"Mingwen Shao, Minggui Han, Lingzhuang Meng, Fukang Liu","doi":"10.1117/1.jei.33.4.043023","DOIUrl":"https://doi.org/10.1117/1.jei.33.4.043023","url":null,"abstract":"Contrastive learning for unpaired image-to-image translation (CUT) aims to learn a mapping from source to target domain with an unpaired dataset, which combines contrastive loss to maximize the mutual information between real and generated images. However, the existing CUT-based methods exhibit unsatisfactory visual quality due to the wrong locating of objects and backgrounds, particularly where it incorrectly transforms the background to match the object pattern in layout-changing datasets. To alleviate the issue, we present background-focused contrastive learning for unpaired image-to-image translation (BFCUT) to improve the background’s consistency between real and its generated images. Specifically, we first generate heat maps to explicitly locate the objects and backgrounds for subsequent contrastive loss and global background similarity loss. Then, the representative queries of objects and backgrounds rather than randomly sampling queries are selected for contrastive loss to promote reality of objects and maintenance of backgrounds. Meanwhile, global semantic vectors with less object information are extracted with the help of heat maps, and we further align the vectors of real images and their corresponding generated images to promote the maintenance of the backgrounds in global background similarity loss. Our BFCUT alleviates the wrong translation of backgrounds and generates more realistic images. Extensive experiments on three datasets demonstrate better quantitative results and qualitative visual effects.","PeriodicalId":54843,"journal":{"name":"Journal of Electronic Imaging","volume":"22 1","pages":""},"PeriodicalIF":1.1,"publicationDate":"2024-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141745419","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Early quadtree with nested multitype tree partitioning algorithm based on convolution neural network for the versatile video coding standard
Pub Date: 2024-07-01 | DOI: 10.1117/1.jei.33.4.043024
Bouthaina Abdallah, Sonda Ben Jdidia, Fatma Belghith, Mohamed Ali Ben Ayed, Nouri Masmoudi
The Joint Video Experts Team has recently finalized the versatile video coding (VVC) standard, which incorporates various advanced encoding tools. These tools deliver substantial gains in coding efficiency, yielding a bitrate reduction of up to 50% compared with the previous standard, high-efficiency video coding. However, this enhancement comes at the expense of high computational complexity. Within this context, we address the new quadtree (QT) with nested multitype tree block partitioning in VVC under the all-intra configuration. We propose a fast intra coding unit (CU) partition algorithm that uses several convolutional neural network (CNN) classifiers to directly predict the partition mode, skip unnecessary split modes, and exit the partitioning process early. The proposed approach first predicts the QT depth of a 64×64 CU with the corresponding CNN classifier. Four CNN classifiers are then applied to predict the partition decision tree of a 32×32 CU using multiple threshold values, bypassing the rate-distortion optimization process to speed up partition coding. The developed method is implemented in the reference software VTM 16.2 and tested on different video sequences. The experimental results confirm that the proposed solution achieves an encoding time reduction of about 46% on average, reaching up to 67.3%, with an acceptable increase in bitrate and an insignificant decrease in quality.
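The multi-threshold decision logic described above might look like the following sketch, in which a confident CNN prediction lets the encoder skip rate-distortion optimization and a less confident one only prunes unlikely split modes; the threshold values and the listed mode set are assumptions for illustration.

```python
import numpy as np

SPLIT_MODES = ["NO_SPLIT", "QT", "BT_H", "BT_V", "TT_H", "TT_V"]  # VVC QT/MTT modes

def decide_partition(probs: np.ndarray, t_high=0.9, t_low=0.1):
    """Multi-threshold early decision for one 32x32 CU (illustrative thresholds).

    probs: softmax output of the CNN classifier over SPLIT_MODES.
    Returns (chosen_modes, skip_rdo): either a single confident mode with RDO
    skipped, or the subset of plausible modes left for the normal RDO search.
    """
    best = int(np.argmax(probs))
    if probs[best] >= t_high:
        return [SPLIT_MODES[best]], True          # early exit: trust the CNN
    # Otherwise prune only the clearly unlikely modes and let RDO decide the rest.
    candidates = [m for m, p in zip(SPLIT_MODES, probs) if p > t_low]
    return candidates, False

probs = np.array([0.02, 0.05, 0.72, 0.11, 0.06, 0.04])
print(decide_partition(probs))   # -> (['BT_H', 'BT_V'], False): only these go through RDO
```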
{"title":"Early quadtree with nested multitype tree partitioning algorithm based on convolution neural network for the versatile video coding standard","authors":"Bouthaina Abdallah, Sonda Ben Jdidia, Fatma Belghith, Mohamed Ali Ben Ayed, Nouri Masmoudi","doi":"10.1117/1.jei.33.4.043024","DOIUrl":"https://doi.org/10.1117/1.jei.33.4.043024","url":null,"abstract":"The Joint Video Experts Team has recently finalized the versatile video coding (VVC) standard, which incorporates various advanced encoding tools. These tools ensure great enhancements in the coding efficiency, leading to a bitrate reduction up to 50% when compared to the previous standard, high-efficiency video coding. However, this enhancement comes at the expense of high computational complexity. Within this context, we address the new quadtree (QT) with nested multitype tree partition block in VVC for all-intra configuration. In fact, we propose a fast intra-coding unit (CU) partition algorithm using various convolution neural network (CNN) classifiers to directly predict the partition mode, skip unnecessary split modes, and early exit the partitioning process. The proposed approach first predicts the QT depth at a CU of size 64×64 by the corresponding CNN classifier. Then four CNN classifiers are applied to predict the partition decision tree at a CU of size 32×32 using multithreshold values and ignore the rate-distortion optimization process to speed up the partition coding time. Thus the developed method is implemented on the reference software VTM 16.2 and tested for different video sequences. The experimental results confirm that the proposed solution achieves an encoding time reduction of about 46% in average, reaching up to 67.3% with an acceptable increase in bitrate and an unsignificant decrease in quality.","PeriodicalId":54843,"journal":{"name":"Journal of Electronic Imaging","volume":"93 1","pages":""},"PeriodicalIF":1.1,"publicationDate":"2024-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141745421","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
High-resolution cloud detection network
Pub Date: 2024-07-01 | DOI: 10.1117/1.jei.33.4.043027
Jingsheng Li, Tianxiang Xue, Jiayi Zhao, Jingmin Ge, Yufang Min, Wei Su, Kun Zhan
The complexity of clouds, particularly in terms of texture detail at high resolutions, has not been well explored by most existing cloud detection networks. We introduce the high-resolution cloud detection network (HR-cloud-Net), which utilizes a hierarchical high-resolution integration approach. HR-cloud-Net integrates a high-resolution representation module, a layer-wise cascaded feature fusion module, and a multiresolution pyramid pooling module to effectively capture complex cloud features. This architecture preserves detailed cloud texture information while facilitating feature exchange across different resolutions, thereby enhancing overall cloud detection performance. Additionally, an approach is introduced wherein a student view, trained on noisy augmented images, is supervised by a teacher view processing normal images. This setup enables the student to learn from the cleaner supervision provided by the teacher, leading to improved performance. Extensive evaluations on three optical satellite image cloud detection datasets validate the superior performance of HR-cloud-Net compared with existing methods.
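A minimal sketch of the teacher-student setup described above is given below: the teacher processes clean images, the student sees noise-augmented copies, and a consistency term pulls the student toward the teacher's detached predictions; the Gaussian noise augmentation and MSE consistency loss are assumed details.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def consistency_step(student, teacher, clean_batch, noise_std=0.1):
    """One step of the teacher-student scheme: the teacher sees the clean
    images, the student sees a noisy augmented copy and is pulled toward the
    teacher's detached cloud masks (noise_std and MSE are assumed details)."""
    noisy_batch = clean_batch + noise_std * torch.randn_like(clean_batch)
    with torch.no_grad():
        target = torch.sigmoid(teacher(clean_batch))   # cleaner supervision
    pred = torch.sigmoid(student(noisy_batch))
    return F.mse_loss(pred, target)

# Toy stand-ins for the two views: any network mapping an image to a 1-channel mask.
student = nn.Conv2d(3, 1, kernel_size=3, padding=1)
teacher = nn.Conv2d(3, 1, kernel_size=3, padding=1)
loss = consistency_step(student, teacher, torch.randn(2, 3, 64, 64))
loss.backward()   # gradients flow only into the student
print(loss.item())
```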
{"title":"High-resolution cloud detection network","authors":"Jingsheng Li, Tianxiang Xue, Jiayi Zhao, Jingmin Ge, Yufang Min, Wei Su, Kun Zhan","doi":"10.1117/1.jei.33.4.043027","DOIUrl":"https://doi.org/10.1117/1.jei.33.4.043027","url":null,"abstract":"The complexity of clouds, particularly in terms of texture detail at high resolutions, has not been well explored by most existing cloud detection networks. We introduce the high-resolution cloud detection network (HR-cloud-Net), which utilizes a hierarchical high-resolution integration approach. HR-cloud-Net integrates a high-resolution representation module, layer-wise cascaded feature fusion module, and multiresolution pyramid pooling module to effectively capture complex cloud features. This architecture preserves detailed cloud texture information while facilitating feature exchange across different resolutions, thereby enhancing the overall performance in cloud detection. Additionally, an approach is introduced wherein a student view, trained on noisy augmented images, is supervised by a teacher view processing normal images. This setup enables the student to learn from cleaner supervisions provided by the teacher, leading to an improved performance. Extensive evaluations on three optical satellite image cloud detection datasets validate the superior performance of HR-cloud-Net compared with existing methods.","PeriodicalId":54843,"journal":{"name":"Journal of Electronic Imaging","volume":"15 1","pages":""},"PeriodicalIF":1.1,"publicationDate":"2024-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141771011","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Event-frame object detection under dynamic background condition
Pub Date: 2024-07-01 | DOI: 10.1117/1.jei.33.4.043028
Wenhao Lu, Zehao Li, Junying Li, Yuncheng Lu, Tony Tae-Hyoung Kim
Neuromorphic vision sensors (NVS), featuring low data redundancy and transmission latency, are widely used in Internet of Things applications. Previous studies have developed various object detection algorithms based on the NVS's unique event data format. However, most of these methods are suited only to scenarios with stationary backgrounds. Under dynamic background conditions, the NVS also captures events from non-target objects because of its mechanism of detecting pixel intensity changes, and the performance of existing detection methods degrades considerably. To address this shortcoming, we add a refinement process to the conventional histogram-based (HIST) detection method. For the regions proposed by HIST, we apply a practical decision condition to categorize them as either object-dominant or background-dominant. Object-dominant regions then undergo a second HIST-based region proposal for precise localization, while background-dominant regions employ an upper-outline determination strategy for target object identification. Finally, the refined results are tracked using a simplified Kalman filter. Evaluated on outdoor drone surveillance with an event camera, the proposed scheme outperforms other methods in both intersection-over-union and F1 score metrics.
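The final tracking stage can be illustrated with a small constant-velocity Kalman filter over a detected box center; the state layout and noise settings below are assumptions, since the paper's exact simplification is not spelled out in the abstract.

```python
import numpy as np

class SimpleKalman:
    """Constant-velocity Kalman filter for a 2-D box center (illustrative
    noise settings; not the paper's exact simplification)."""

    def __init__(self, q=1e-2, r=1.0):
        self.x = np.zeros(4)                       # state: [cx, cy, vx, vy]
        self.P = np.eye(4)
        self.F = np.eye(4); self.F[0, 2] = self.F[1, 3] = 1.0   # dt = 1 frame
        self.H = np.zeros((2, 4)); self.H[0, 0] = self.H[1, 1] = 1.0
        self.Q = q * np.eye(4)                     # process noise
        self.R = r * np.eye(2)                     # measurement noise

    def step(self, z):
        # Predict.
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        # Update with the measured center z = [cx, cy].
        y = np.asarray(z, dtype=float) - self.H @ self.x
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)
        self.x = self.x + K @ y
        self.P = (np.eye(4) - K @ self.H) @ self.P
        return self.x[:2]                          # filtered center

kf = SimpleKalman()
for z in [(10, 12), (11, 13), (13, 15)]:
    print(kf.step(z))
```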
{"title":"Event-frame object detection under dynamic background condition","authors":"Wenhao Lu, Zehao Li, Junying Li, Yuncheng Lu, Tony Tae-Hyoung Kim","doi":"10.1117/1.jei.33.4.043028","DOIUrl":"https://doi.org/10.1117/1.jei.33.4.043028","url":null,"abstract":"Neuromorphic vision sensors (NVS) with the features of small data redundancy and transmission latency are widely implemented in Internet of Things applications. Previous studies have developed various object detection algorithms based on NVS’s unique event data format. However, most of these methods are only adaptive for scenarios with stationary backgrounds. Under dynamic background conditions, NVS can also acquire the events of non-target objects due to its mechanism of detecting pixel intensity changes. As a result, the performance of existing detection methods is greatly degraded. To address this shortcoming, we introduce an extra refinement process to the conventional histogram-based (HIST) detection method. For the proposed regions from HIST, we apply a practical decision condition to categorize them as either object-dominant or background-dominant cases. Then, the object-dominant regions undergo a second-time HIST-based region proposal for precise localization, while background-dominant regions employ an upper outline determination strategy for target object identification. Finally, the refined results are tracked using a simplified Kalman filter approach. Evaluated in an outdoor drone surveillance with an event camera, the proposed scheme demonstrates superior performance in both intersection over union and F1 score metrics compared to other methods.","PeriodicalId":54843,"journal":{"name":"Journal of Electronic Imaging","volume":"62 1","pages":""},"PeriodicalIF":1.1,"publicationDate":"2024-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141771012","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Robust video hashing with canonical polyadic decomposition and Hahn moments
Zhenjun Tang, Huijiang Zhuang, Mengzhu Yu, Lv Chen, Xiaoping Liang, Xianquan Zhang
Pub Date: 2024-07-01 | DOI: 10.1117/1.jei.33.4.043007
Video hashing is an efficient technique for tasks such as copy detection and retrieval. This paper utilizes canonical polyadic (CP) decomposition and Hahn moments to design a robust video hashing algorithm. The first significant contribution is the secondary frame construction, which uses three weighting techniques to generate three secondary frames for each video group; these effectively capture features of video frames from different aspects and thus improve discrimination. Another contribution is deep feature extraction via ResNet50 and CP decomposition: ResNet50 provides rich features, and CP decomposition learns a compact and discriminative representation from them. In addition, the Hahn moments of the secondary frames are used to construct hash elements. Extensive experiments on an open video dataset demonstrate that the proposed algorithm surpasses several state-of-the-art algorithms in balancing discrimination and robustness.
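A compact representation via CP decomposition, as described above, can be sketched with the tensorly library; the tensor layout, the rank, and the use of tensorly itself are assumptions for illustration rather than the paper's implementation.

```python
import numpy as np
import tensorly as tl
from tensorly.decomposition import parafac

# Suppose deep features from the three secondary frames of each video group are
# stacked into a 3rd-order tensor (groups x frames x feature_dim); rank 8 is an
# assumed value.
features = np.random.rand(20, 3, 2048).astype(np.float32)   # e.g. ResNet50 outputs
tensor = tl.tensor(features)

weights, factors = parafac(tensor, rank=8, n_iter_max=200)
# factors[0]: (20, 8) -- one compact 8-dim representation per video group, which
# could then be combined with Hahn moments and quantized into hash elements.
compact = factors[0]
print(compact.shape)
```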
{"title":"Robust video hashing with canonical polyadic decomposition and Hahn moments","authors":"Zhenjun Tang, Huijiang Zhuang, Mengzhu Yu, Lv Chen, Xiaoping Liang, Xianquan Zhang","doi":"10.1117/1.jei.33.4.043007","DOIUrl":"https://doi.org/10.1117/1.jei.33.4.043007","url":null,"abstract":"Video hashing is an efficient technique for tasks like copy detection and retrieval. This paper utilizes canonical polyadic (CP) decomposition and Hahn moments to design a robust video hashing. The first significant contribution is the secondary frame construction. It uses three weighted techniques to generate three secondary frames for each video group, which can effectively capture features of video frames from different aspects and thus improves discrimination. Another contribution is the deep feature extraction via the ResNet50 and CP decomposition. The use of the ResNet50 can provide rich features and the CP decomposition can learn a compact and discriminative representation from the rich features. In addition, the Hahn moments of secondary frames are taken to construct hash elements. Extensive experiments on the open video dataset demonstrate that the proposed algorithm surpasses several state-of-the-art algorithms in balancing discrimination and robustness.","PeriodicalId":54843,"journal":{"name":"Journal of Electronic Imaging","volume":"25 1","pages":""},"PeriodicalIF":1.1,"publicationDate":"2024-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141568281","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Hyperspectral image denoising via self-modulated cross-attention deformable convolutional neural network
Pub Date: 2024-07-01 | DOI: 10.1117/1.jei.33.4.043015
Ying Wang, Jie Qiu, Yanxiang Zhao
Compared with ordinary images, hyperspectral images (HSIs) consist of many bands that provide rich spatial and spectral information and are widely used in remote sensing. However, HSIs are subject to various types of noise due to limited sensor sensitivity, low light intensity in some bands, and corruption during acquisition, transmission, and storage. The problem of HSI denoising has therefore attracted extensive attention. Although recent HSI denoising methods provide effective solutions in various optimization directions, their performance under real, complex noise is still not optimal. To address these issues, this article proposes a self-modulated cross-attention network that fully utilizes spatial and spectral information. The core of the model is the use of deformable convolution to cross-fuse spatial and spectral features and improve the network's denoising capability. At the same time, a self-modulating residual block, which we call a feature enhancement block, allows the network to transform features adaptively based on neighboring bands, improving its ability to deal with complex noise. Finally, we propose a three-segment network architecture that improves the stability of the model. Comparative experiments on synthetic and real data show that the proposed method outperforms other state-of-the-art methods.
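A minimal sketch of cross-fusing spatial and spectral feature maps with a deformable convolution is shown below, using torchvision's DeformConv2d; the channel sizes, the concatenation-based fusion, and the offset-prediction layer are illustrative assumptions, not the paper's exact block.

```python
import torch
import torch.nn as nn
from torchvision.ops import DeformConv2d

class DeformFuse(nn.Module):
    """Fuse spatial and spectral feature maps with a 3x3 deformable convolution
    whose sampling offsets are predicted from the concatenated features."""

    def __init__(self, channels=64):
        super().__init__()
        # 2 * 3 * 3 = 18 offset channels for a 3x3 deformable kernel.
        self.offset = nn.Conv2d(2 * channels, 18, kernel_size=3, padding=1)
        self.deform = DeformConv2d(2 * channels, channels, kernel_size=3, padding=1)

    def forward(self, spatial_feat, spectral_feat):
        x = torch.cat([spatial_feat, spectral_feat], dim=1)
        return self.deform(x, self.offset(x))

fuse = DeformFuse()
a, b = torch.randn(1, 64, 32, 32), torch.randn(1, 64, 32, 32)
print(fuse(a, b).shape)   # torch.Size([1, 64, 32, 32])
```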
{"title":"Hyperspectral image denoising via self-modulated cross-attention deformable convolutional neural network","authors":"Ying Wang, Jie Qiu, Yanxiang Zhao","doi":"10.1117/1.jei.33.4.043015","DOIUrl":"https://doi.org/10.1117/1.jei.33.4.043015","url":null,"abstract":"Compared with ordinary images, hyperspectral images (HSIs) consist of many bands that can provide rich spatial and spectral information and are widely used in remote sensing. However, HSIs are subject to various types of noise due to limited sensor sensitivity; low light intensity in the bands; and corruption during acquisition, transmission, and storage. Therefore, the problem of HSI denoising has attracted extensive attention from society. Although recent HSI denoising methods provide effective solutions in various optimization directions, their performance under real complex noise is still not optimal. To address these issues, this article proposes a self-modulated cross-attention network that fully utilizes spatial and spectral information. The core of the model is the use of deformable convolution to cross-fuse spatial and spectral features to improve the network denoising capability. At the same time, a self-modulating residual block allows the network to transform features in an adaptive manner based on neighboring bands, improving the network’s ability to deal with complex noise, which we call a feature enhancement block. Finally, we propose a three-segment network architecture that improves the stability of the model. The method proposed in this work outperforms other state-of-the-art methods through comparative analysis of experiments in synthetic and real data.","PeriodicalId":54843,"journal":{"name":"Journal of Electronic Imaging","volume":"28 1","pages":""},"PeriodicalIF":1.1,"publicationDate":"2024-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141612603","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Scene adaptive color compensation and multi-weight fusion of underwater image
Pub Date: 2024-07-01 | DOI: 10.1117/1.jei.33.4.043031
Muhammad Aon, Huibing Wang, Muhammad Noman Waleed, Yulin Wei, Xianping Fu
Capturing high-quality photos in an underwater environment is difficult, as light attenuation, color distortion, and reduced contrast pose significant challenges. One fact that is usually ignored is the non-uniform texture degradation in distorted images; the loss of fine textures in underwater images hinders object detection and recognition. To address this problem, we introduce an image enhancement model called scene adaptive color compensation and multi-weight fusion, which extracts fine textural details under diverse environments and enhances the overall quality of underwater imagery. Our method blends three input images derived from an adaptive color-compensated and color-corrected version of the degraded image. The first two inputs address the low contrast and haze of the image, respectively, while the third extracts fine texture details at different scales and orientations. Finally, the inputs and their associated weight maps are normalized and fused through multi-weight fusion. The proposed model is tested on a distinct set of underwater imagery with varying levels of degradation and frequently outperforms state-of-the-art methods, yielding significant improvements in texture visibility, reduced color distortion, and better overall quality of the submerged images.
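The multi-weight fusion step can be sketched as a per-pixel normalized blend of the derived inputs; the choice of weight maps and the simple weighted averaging shown here are assumptions, not the paper's exact formulation.

```python
import numpy as np

def multi_weight_fusion(inputs, weight_maps, eps=1e-8):
    """Fuse derived images with per-pixel weight maps (illustrative sketch).

    inputs:      list of float images, each of shape (H, W, 3) in [0, 1].
    weight_maps: list of matching (H, W) weight maps (e.g. contrast, saturation,
                 or texture measures -- assumed choices).
    """
    stack = np.stack(inputs, axis=0)                       # (K, H, W, 3)
    w = np.stack(weight_maps, axis=0).astype(np.float64)   # (K, H, W)
    w = w / (w.sum(axis=0, keepdims=True) + eps)           # normalize per pixel
    return (stack * w[..., None]).sum(axis=0)              # weighted blend

h, wd = 4, 5
imgs = [np.random.rand(h, wd, 3) for _ in range(3)]
maps = [np.random.rand(h, wd) for _ in range(3)]
fused = multi_weight_fusion(imgs, maps)
print(fused.shape)   # (4, 5, 3)
```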
{"title":"Scene adaptive color compensation and multi-weight fusion of underwater image","authors":"Muhammad Aon, Huibing Wang, Muhammad Noman Waleed, Yulin Wei, Xianping Fu","doi":"10.1117/1.jei.33.4.043031","DOIUrl":"https://doi.org/10.1117/1.jei.33.4.043031","url":null,"abstract":"Capturing high-quality photos in an underwater atmosphere is complicated, as light attenuation, color distortion, and reduced contrast pose significant challenges. However, one fact usually ignored is the non-uniform texture degradation in distorted images. The loss of comprehensive textures in underwater images poses obstacles in object detection and recognition. To address this problem, we have introduced an image enhancement model called scene adaptive color compensation and multi-weight fusion for extracting fine textural details under diverse environments and enhancing the overall quality of the underwater imagery. Our method blends three input images derived from the adaptive color-compensating and color-corrected version of the degraded image. The first two input images are used to adjust the low contrast and dehazing of the image respectively. Similarly, the third input image is used to extract the fine texture details based on different scales and orientations of the image. Finally, the input images with their associated weight maps are normalized and fused through multi-weight fusion. The proposed model is tested on a distinct set of underwater imagery with varying levels of degradation and frequently outperformed state-of-the-art methods, producing significant improvements in texture visibility, reducing color distortion, and enhancing the overall quality of the submerged images.","PeriodicalId":54843,"journal":{"name":"Journal of Electronic Imaging","volume":"36 1","pages":""},"PeriodicalIF":1.1,"publicationDate":"2024-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141771010","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Research on image segmentation effect based on denoising preprocessing
Pub Date: 2024-06-01 | DOI: 10.1117/1.jei.33.3.033033
Lu Ronghui, Tzong-Jer Chen
Our study investigates the impact of denoising preprocessing on the accuracy of image segmentation. Specifically, images with Gaussian noise were segmented using the fuzzy c-means method (FCM), local binary fitting (LBF), the adaptive active contour model coupling local and global information (EVOL_LCV), and the U-Net semantic segmentation method. These methods were then quantitatively evaluated. Subsequently, various denoising techniques, such as mean, median, Gaussian, bilateral filtering, and feed-forward denoising convolutional neural network (DnCNN), were applied to the original images, and the segmentation was performed using the methods mentioned above, followed by another round of quantitative evaluations. The two quantitative evaluations revealed that the segmentation results were clearly enhanced after denoising. Specifically, the Dice similarity coefficient of the FCM segmentation improved by 4% to 44%, LBF improved by 16%, and EVOL_LCV presented limited changes. Additionally, the U-Net network trained on denoised images attained a segmentation improvement of over 5%. The accuracy of traditional segmentation and semantic segmentation of Gaussian noise images is improved effectively using DnCNN.
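A minimal sketch of the denoising preprocessing and the Dice evaluation used in the study is shown below with OpenCV's classical filters; the kernel sizes are assumed values, and DnCNN is omitted because it requires a separately trained network.

```python
import cv2
import numpy as np

def dice(pred: np.ndarray, gt: np.ndarray, eps: float = 1e-8) -> float:
    """Dice similarity coefficient between two binary masks."""
    inter = np.logical_and(pred > 0, gt > 0).sum()
    return 2.0 * inter / (np.count_nonzero(pred) + np.count_nonzero(gt) + eps)

def denoise_all(noisy: np.ndarray) -> dict:
    """Classical denoisers compared in the study (kernel sizes are assumed)."""
    return {
        "mean":      cv2.blur(noisy, (5, 5)),
        "median":    cv2.medianBlur(noisy, 5),
        "gaussian":  cv2.GaussianBlur(noisy, (5, 5), 0),
        "bilateral": cv2.bilateralFilter(noisy, 9, 75, 75),
    }

img = (np.random.rand(128, 128) * 255).astype(np.uint8)   # stand-in noisy image
for name, out in denoise_all(img).items():
    print(name, out.shape)

# Dice between a predicted and a ground-truth segmentation mask (toy masks here).
pred = np.zeros((128, 128), np.uint8); pred[20:80, 20:80] = 1
gt = np.zeros((128, 128), np.uint8); gt[30:90, 30:90] = 1
print("dice:", dice(pred, gt))
```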
{"title":"Research on image segmentation effect based on denoising preprocessing","authors":"Lu Ronghui, Tzong-Jer Chen","doi":"10.1117/1.jei.33.3.033033","DOIUrl":"https://doi.org/10.1117/1.jei.33.3.033033","url":null,"abstract":"Our study investigates the impact of denoising preprocessing on the accuracy of image segmentation. Specifically, images with Gaussian noise were segmented using the fuzzy c-means method (FCM), local binary fitting (LBF), the adaptive active contour model coupling local and global information (EVOL_LCV), and the U-Net semantic segmentation method. These methods were then quantitatively evaluated. Subsequently, various denoising techniques, such as mean, median, Gaussian, bilateral filtering, and feed-forward denoising convolutional neural network (DnCNN), were applied to the original images, and the segmentation was performed using the methods mentioned above, followed by another round of quantitative evaluations. The two quantitative evaluations revealed that the segmentation results were clearly enhanced after denoising. Specifically, the Dice similarity coefficient of the FCM segmentation improved by 4% to 44%, LBF improved by 16%, and EVOL_LCV presented limited changes. Additionally, the U-Net network trained on denoised images attained a segmentation improvement of over 5%. The accuracy of traditional segmentation and semantic segmentation of Gaussian noise images is improved effectively using DnCNN.","PeriodicalId":54843,"journal":{"name":"Journal of Electronic Imaging","volume":"32 1","pages":""},"PeriodicalIF":1.1,"publicationDate":"2024-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141518296","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Special Section Guest Editorial: Quality Control by Artificial Vision VII
Pub Date: 2024-06-01 | DOI: 10.1117/1.jei.33.3.031201
Igor Jovančević, Jean-José Orteu
Guest Editors Igor Jovančević and Jean-José Orteu introduce the Special Section on Quality Control by Artificial Vision VII.
{"title":"Special Section Guest Editorial: Quality Control by Artificial Vision VII","authors":"Igor Jovančević, Jean-José Orteu","doi":"10.1117/1.jei.33.3.031201","DOIUrl":"https://doi.org/10.1117/1.jei.33.3.031201","url":null,"abstract":"Guest Editors Igor Jovančević and Jean-José Orteu introduce the Special Section on Quality Control by Artificial Vision VII.","PeriodicalId":54843,"journal":{"name":"Journal of Electronic Imaging","volume":"19 1","pages":""},"PeriodicalIF":1.1,"publicationDate":"2024-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141509997","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}