
Journal of Electronic Imaging — Latest Publications

Scale separation: video crowd counting with different density maps
IF 1.1 · Tier 4 (Computer Science) · Q4 ENGINEERING, ELECTRICAL & ELECTRONIC · Pub Date: 2024-07-01 · DOI: 10.1117/1.jei.33.4.043016
Ao Zhang, Xin Deng, Baoying Liu, Weiwei Zhang, Jun Guo, Linrui Xie
Most crowd counting methods rely on integrating density maps for prediction, but their performance degrades under density variations. Existing methods primarily employ a multi-scale architecture to mitigate this issue; however, few approaches consider scale and temporal information concurrently. We propose a scale-divided architecture for video crowd counting. Initially, density maps of different Gaussian scales are employed to retain information at various scales, accommodating scale changes in images. Subsequently, we observe that the spatiotemporal network places greater emphasis on individual locations, prompting us to aggregate temporal information at a specific scale. This design enables the temporal model to acquire more spatial information and alleviates occlusion issues. Experimental results on various public datasets demonstrate the superior performance of our proposed method.
Citations: 0
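To make the multi-scale density-map construction concrete, here is a minimal sketch (not the authors' code) that builds one density map per Gaussian scale from head-point annotations; the sigma values and frame size are hypothetical.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def multi_scale_density_maps(points, shape, sigmas=(4.0, 8.0, 16.0)):
    """Build one density map per Gaussian scale from head annotations.

    points: iterable of (row, col) head coordinates.
    shape:  (height, width) of the frame.
    sigmas: hypothetical Gaussian scales, one map per scale.
    """
    base = np.zeros(shape, dtype=np.float64)
    for r, c in points:
        if 0 <= r < shape[0] and 0 <= c < shape[1]:
            base[int(r), int(c)] += 1.0
    # Smoothing unit impulses with a normalized Gaussian preserves the
    # total mass, so integrating any map recovers the crowd count.
    return [gaussian_filter(base, sigma=s) for s in sigmas]

# Example: three annotated heads in a 240x320 frame.
maps = multi_scale_density_maps([(50, 60), (120, 200), (30, 310)], (240, 320))
print([m.sum() for m in maps])  # each sum is approximately 3.0
```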
Early quadtree with nested multitype tree partitioning algorithm based on convolution neural network for the versatile video coding standard
IF 1.1 · Tier 4 (Computer Science) · Q4 ENGINEERING, ELECTRICAL & ELECTRONIC · Pub Date: 2024-07-01 · DOI: 10.1117/1.jei.33.4.043024
Bouthaina Abdallah, Sonda Ben Jdidia, Fatma Belghith, Mohamed Ali Ben Ayed, Nouri Masmoudi
The Joint Video Experts Team has recently finalized the versatile video coding (VVC) standard, which incorporates various advanced encoding tools. These tools greatly enhance coding efficiency, reducing the bitrate by up to 50% compared with the previous standard, high-efficiency video coding. However, this enhancement comes at the expense of high computational complexity. Within this context, we address the new quadtree (QT) with nested multitype tree partitioning in VVC for the all-intra configuration. We propose a fast intra coding unit (CU) partition algorithm that uses several convolutional neural network (CNN) classifiers to directly predict the partition mode, skip unnecessary split modes, and exit the partitioning process early. The proposed approach first predicts the QT depth of a 64×64 CU with the corresponding CNN classifier. Four CNN classifiers are then applied to predict the partition decision tree at a 32×32 CU using multiple threshold values, bypassing the rate-distortion optimization process to speed up partition coding. The developed method is implemented on the reference software VTM 16.2 and tested on different video sequences. The experimental results confirm that the proposed solution reduces the encoding time by about 46% on average, and by up to 67.3%, with an acceptable increase in bitrate and an insignificant decrease in quality.
Citations: 0
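As an illustration of how CNN split-mode probabilities combined with multiple thresholds might gate the partition search, the following is a hedged sketch; the mode labels, threshold values, and decision rule are stand-ins for illustration, not details taken from the paper.

```python
import numpy as np

# Hypothetical VVC split-mode labels for a CU.
SPLIT_MODES = ["no_split", "qt", "bt_h", "bt_v", "tt_h", "tt_v"]

def partition_decision(probs, t_high=0.85, t_low=0.05):
    """Gate the RDO search with two hypothetical thresholds.

    probs: softmax output of a CNN classifier over SPLIT_MODES.
    - If the top mode is confident enough, take it alone and exit early.
    - Otherwise, keep only plausible modes and skip the rest.
    """
    probs = np.asarray(probs)
    top = int(probs.argmax())
    if probs[top] >= t_high:                       # early exit: single mode
        return [SPLIT_MODES[top]]
    return [m for m, p in zip(SPLIT_MODES, probs)  # prune unlikely splits
            if p >= t_low]

print(partition_decision([0.9, 0.05, 0.02, 0.01, 0.01, 0.01]))  # ['no_split']
print(partition_decision([0.4, 0.3, 0.2, 0.05, 0.03, 0.02]))    # pruned list
```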
Background-focused contrastive learning for unpaired image-to-image translation
IF 1.1 · Tier 4 (Computer Science) · Q4 ENGINEERING, ELECTRICAL & ELECTRONIC · Pub Date: 2024-07-01 · DOI: 10.1117/1.jei.33.4.043023
Mingwen Shao, Minggui Han, Lingzhuang Meng, Fukang Liu
Contrastive learning for unpaired image-to-image translation (CUT) aims to learn a mapping from a source to a target domain with an unpaired dataset, using a contrastive loss to maximize the mutual information between real and generated images. However, existing CUT-based methods exhibit unsatisfactory visual quality due to incorrect localization of objects and backgrounds, particularly when the background is wrongly transformed to match the object pattern in layout-changing datasets. To alleviate this issue, we present background-focused contrastive learning for unpaired image-to-image translation (BFCUT), which improves background consistency between real images and their generated counterparts. Specifically, we first generate heat maps to explicitly locate objects and backgrounds for the subsequent contrastive loss and global background similarity loss. Then, representative queries of objects and backgrounds, rather than randomly sampled queries, are selected for the contrastive loss to promote the realism of objects and the preservation of backgrounds. Meanwhile, global semantic vectors with less object information are extracted with the help of the heat maps, and we further align the vectors of real images with those of their corresponding generated images so that the global background similarity loss preserves backgrounds. Our BFCUT alleviates erroneous background translation and generates more realistic images. Extensive experiments on three datasets demonstrate better quantitative results and qualitative visual effects.
Citations: 0
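The query-selection idea above might be sketched as follows: a heat map splits patch features into object-dominant and background-dominant queries before an InfoNCE-style contrastive loss. The tensor shapes, threshold, and loss form are assumptions for illustration, not the paper's implementation.

```python
import torch
import torch.nn.functional as F

def infonce(query, positive, negatives, tau=0.07):
    """InfoNCE loss for one set of queries.
    query, positive: (N, C); negatives: (N, M, C)."""
    q = F.normalize(query, dim=-1)
    pos = (q * F.normalize(positive, dim=-1)).sum(-1, keepdim=True)      # (N, 1)
    neg = torch.einsum("nc,nmc->nm", q, F.normalize(negatives, dim=-1))  # (N, M)
    logits = torch.cat([pos, neg], dim=1) / tau
    # The positive sits at index 0 of each row of logits.
    return F.cross_entropy(logits, torch.zeros(len(q), dtype=torch.long))

def split_queries_by_heatmap(feats, heat, thresh=0.5):
    """feats: (L, C) patch features; heat: (L,) object scores in [0, 1].
    Returns object-dominant and background-dominant feature sets."""
    obj_mask = heat >= thresh
    return feats[obj_mask], feats[~obj_mask]

# Toy example with random stand-ins.
feats, heat = torch.randn(100, 64), torch.rand(100)
obj_q, bg_q = split_queries_by_heatmap(feats, heat)
print(obj_q.shape, bg_q.shape)
```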
High-resolution cloud detection network
IF 1.1 · Tier 4 (Computer Science) · Q4 ENGINEERING, ELECTRICAL & ELECTRONIC · Pub Date: 2024-07-01 · DOI: 10.1117/1.jei.33.4.043027
Jingsheng Li, Tianxiang Xue, Jiayi Zhao, Jingmin Ge, Yufang Min, Wei Su, Kun Zhan
The complexity of clouds, particularly their texture detail at high resolutions, has not been well explored by most existing cloud detection networks. We introduce the high-resolution cloud detection network (HR-cloud-Net), which utilizes a hierarchical high-resolution integration approach. HR-cloud-Net integrates a high-resolution representation module, a layer-wise cascaded feature fusion module, and a multiresolution pyramid pooling module to effectively capture complex cloud features. This architecture preserves detailed cloud texture information while facilitating feature exchange across different resolutions, thereby enhancing overall cloud detection performance. Additionally, we introduce a training scheme in which a student view, trained on noisy augmented images, is supervised by a teacher view processing normal images. This setup enables the student to learn from the cleaner supervision provided by the teacher, improving performance. Extensive evaluations on three optical satellite image cloud detection datasets validate the superior performance of HR-cloud-Net compared with existing methods.
Citations: 0
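A minimal sketch of the noisy-student / clean-teacher supervision described above; the Gaussian noise augmentation, MSE consistency loss, and EMA teacher update are common defaults assumed here rather than details taken from the paper.

```python
import copy
import torch
import torch.nn.functional as F

@torch.no_grad()
def ema_update(teacher, student, m=0.99):
    """Teacher weights track an exponential moving average of the student."""
    for pt, ps in zip(teacher.parameters(), student.parameters()):
        pt.mul_(m).add_(ps, alpha=1.0 - m)

def consistency_step(student, teacher, images, noise_std=0.1):
    """Student view: noise-augmented image; teacher view: the clean image.
    The student is pulled toward the teacher's cleaner cloud-mask prediction."""
    noisy = images + noise_std * torch.randn_like(images)  # assumed augmentation
    with torch.no_grad():
        target = torch.sigmoid(teacher(images))            # cleaner supervision
    pred = torch.sigmoid(student(noisy))
    return F.mse_loss(pred, target)

# Toy stand-ins for the two views of the network.
student = torch.nn.Conv2d(3, 1, 3, padding=1)
teacher = copy.deepcopy(student)
loss = consistency_step(student, teacher, torch.randn(2, 3, 64, 64))
loss.backward()                # gradients flow into the student only
ema_update(teacher, student)   # the teacher then drifts toward the student
print(float(loss))
```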
Event-frame object detection under dynamic background condition
IF 1.1 · Tier 4 (Computer Science) · Q4 ENGINEERING, ELECTRICAL & ELECTRONIC · Pub Date: 2024-07-01 · DOI: 10.1117/1.jei.33.4.043028
Wenhao Lu, Zehao Li, Junying Li, Yuncheng Lu, Tony Tae-Hyoung Kim
Neuromorphic vision sensors (NVS), featuring low data redundancy and low transmission latency, are widely implemented in Internet of Things applications. Previous studies have developed various object detection algorithms based on the NVS's unique event data format. However, most of these methods are suited only to scenarios with stationary backgrounds. Under dynamic background conditions, an NVS also captures events from non-target objects because of its mechanism of detecting pixel intensity changes, and the performance of existing detection methods degrades greatly as a result. To address this shortcoming, we introduce an extra refinement process into the conventional histogram-based (HIST) detection method. For the regions proposed by HIST, we apply a practical decision condition to categorize them as either object-dominant or background-dominant cases. The object-dominant regions then undergo a second HIST-based region proposal for precise localization, while background-dominant regions employ an upper-outline determination strategy for target object identification. Finally, the refined results are tracked using a simplified Kalman filter approach. Evaluated on outdoor drone surveillance captured with an event camera, the proposed scheme demonstrates superior performance in both intersection-over-union and F1 score metrics compared with other methods.
Citations: 0
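To make the "simplified Kalman filter" step concrete, here is a minimal constant-velocity tracker for a box center; the state layout and noise magnitudes are illustrative assumptions, not values from the paper.

```python
import numpy as np

class SimpleKalman:
    """Constant-velocity Kalman filter over state [x, y, vx, vy]."""
    def __init__(self, x0, y0, dt=1.0, q=1e-2, r=1.0):
        self.x = np.array([x0, y0, 0.0, 0.0])
        self.P = np.eye(4)
        self.F = np.eye(4); self.F[0, 2] = self.F[1, 3] = dt  # motion model
        self.H = np.zeros((2, 4)); self.H[0, 0] = self.H[1, 1] = 1.0
        self.Q = q * np.eye(4)   # process noise (assumed magnitude)
        self.R = r * np.eye(2)   # measurement noise (assumed magnitude)

    def predict(self):
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        return self.x[:2]

    def update(self, z):
        y = np.asarray(z) - self.H @ self.x           # innovation
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)      # Kalman gain
        self.x = self.x + K @ y
        self.P = (np.eye(4) - K @ self.H) @ self.P
        return self.x[:2]

# Track a detected box center across frames.
kf = SimpleKalman(10, 20)
kf.predict()
print(kf.update([11, 22]))  # smoothed center estimate
```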
Robust video hashing with canonical polyadic decomposition and Hahn moments
IF 1.1 · Tier 4 (Computer Science) · Q4 ENGINEERING, ELECTRICAL & ELECTRONIC · Pub Date: 2024-07-01 · DOI: 10.1117/1.jei.33.4.043007
Zhenjun Tang, Huijiang Zhuang, Mengzhu Yu, Lv Chen, Xiaoping Liang, Xianquan Zhang
Video hashing is an efficient technique for tasks such as copy detection and retrieval. This paper utilizes canonical polyadic (CP) decomposition and Hahn moments to design a robust video hashing scheme. The first significant contribution is the secondary frame construction: three weighting techniques are used to generate three secondary frames for each video group, which effectively capture features of video frames from different aspects and thus improve discrimination. Another contribution is deep feature extraction via ResNet50 and CP decomposition. ResNet50 provides rich features, and CP decomposition learns a compact and discriminative representation from them. In addition, the Hahn moments of the secondary frames are used to construct hash elements. Extensive experiments on an open video dataset demonstrate that the proposed algorithm surpasses several state-of-the-art algorithms in balancing discrimination and robustness.
Citations: 0
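A minimal sketch of compressing a stack of deep features with CP decomposition, using TensorLy's parafac; the tensor shape and rank are hypothetical, illustrating the technique rather than the authors' pipeline.

```python
import numpy as np
import tensorly as tl
from tensorly.decomposition import parafac

tl.set_backend("numpy")

# Hypothetical feature tensor: 3 secondary frames x 64 channels x 49 spatial bins.
features = np.random.rand(3, 64, 49)

# CP decomposition factorizes the tensor into `rank` rank-one components,
# yielding a compact summary of the rich deep features.
weights, factors = parafac(tl.tensor(features), rank=8, n_iter_max=200)

# Concatenated factor matrices can serve as the compact representation
# from which hash elements are then derived.
compact = np.concatenate([f.ravel() for f in factors])
print(compact.shape)  # (3*8 + 64*8 + 49*8,) = (928,)
```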
Scene adaptive color compensation and multi-weight fusion of underwater image
IF 1.1 · Tier 4 (Computer Science) · Q4 ENGINEERING, ELECTRICAL & ELECTRONIC · Pub Date: 2024-07-01 · DOI: 10.1117/1.jei.33.4.043031
Muhammad Aon, Huibing Wang, Muhammad Noman Waleed, Yulin Wei, Xianping Fu
Capturing high-quality photos in an underwater environment is complicated, as light attenuation, color distortion, and reduced contrast pose significant challenges. One fact that is usually ignored, however, is the non-uniform texture degradation in distorted images: the loss of fine textures in underwater images hinders object detection and recognition. To address this problem, we introduce an image enhancement model called scene adaptive color compensation and multi-weight fusion, which extracts fine textural details under diverse environments and enhances the overall quality of underwater imagery. Our method blends three input images derived from adaptively color-compensated and color-corrected versions of the degraded image. The first two inputs address the image's low contrast and haze, respectively; the third is used to extract fine texture details at different scales and orientations. Finally, the inputs and their associated weight maps are normalized and fused through multi-weight fusion. The proposed model is tested on a distinct set of underwater images with varying levels of degradation and frequently outperforms state-of-the-art methods, yielding significant improvements in texture visibility, reduced color distortion, and enhanced overall quality of the submerged images.
Citations: 0
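The final fusion step lends itself to a short sketch: per-pixel weight maps are normalized across inputs and used to blend the derived images. The weight maps below are random placeholders, not the paper's actual contrast or saliency measures.

```python
import numpy as np

def multi_weight_fusion(inputs, weights, eps=1e-8):
    """Blend derived images with normalized per-pixel weight maps.

    inputs:  list of HxWx3 float images (e.g., color-compensated and
             color-corrected versions of the degraded image).
    weights: list of HxW weight maps, one per input (placeholders here).
    """
    w = np.stack(weights).astype(np.float64)      # (K, H, W)
    w = w / (w.sum(axis=0, keepdims=True) + eps)  # normalize across inputs
    imgs = np.stack(inputs).astype(np.float64)    # (K, H, W, 3)
    return (w[..., None] * imgs).sum(axis=0)      # weighted per-pixel blend

# Example with random stand-ins for the three derived inputs.
H, W = 120, 160
ins = [np.random.rand(H, W, 3) for _ in range(3)]
ws = [np.random.rand(H, W) for _ in range(3)]
fused = multi_weight_fusion(ins, ws)
print(fused.shape)  # (120, 160, 3), a convex combination of the inputs
```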
Hyperspectral image denoising via self-modulated cross-attention deformable convolutional neural network
IF 1.1 · Tier 4 (Computer Science) · Q4 ENGINEERING, ELECTRICAL & ELECTRONIC · Pub Date: 2024-07-01 · DOI: 10.1117/1.jei.33.4.043015
Ying Wang, Jie Qiu, Yanxiang Zhao
Compared with ordinary images, hyperspectral images (HSIs) consist of many bands that provide rich spatial and spectral information, and they are widely used in remote sensing. However, HSIs are subject to various types of noise due to limited sensor sensitivity, low light intensity in some bands, and corruption during acquisition, transmission, and storage. The problem of HSI denoising has therefore attracted extensive attention. Although recent HSI denoising methods provide effective solutions along various optimization directions, their performance under real, complex noise is still not optimal. To address these issues, this article proposes a self-modulated cross-attention network that fully utilizes spatial and spectral information. The core of the model is the use of deformable convolution to cross-fuse spatial and spectral features, improving the network's denoising capability. At the same time, a self-modulating residual block, which we call a feature enhancement block, allows the network to transform features adaptively based on neighboring bands, improving its ability to deal with complex noise. Finally, we propose a three-segment network architecture that improves the stability of the model. Comparative experiments on synthetic and real data show that the proposed method outperforms other state-of-the-art methods.
Citations: 0
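As a sketch of the deformable-convolution cross-fusion idea, the following predicts sampling offsets from spectral features and applies them when convolving spatial features, using torchvision's DeformConv2d; the channel sizes and fusion wiring are assumptions for illustration.

```python
import torch
import torch.nn as nn
from torchvision.ops import DeformConv2d

class CrossFuseBlock(nn.Module):
    """Offsets predicted from spectral features steer the sampling
    locations used to convolve the spatial features."""
    def __init__(self, c_spatial=32, c_spectral=32, c_out=32, k=3):
        super().__init__()
        # 2 offsets (dx, dy) per kernel tap; a 3x3 kernel needs 18 channels.
        self.offset_pred = nn.Conv2d(c_spectral, 2 * k * k, 3, padding=1)
        self.deform = DeformConv2d(c_spatial, c_out, k, padding=k // 2)

    def forward(self, spatial_feat, spectral_feat):
        offset = self.offset_pred(spectral_feat)
        return self.deform(spatial_feat, offset)

block = CrossFuseBlock()
spa = torch.randn(1, 32, 64, 64)  # spatial-branch features (assumed shapes)
spe = torch.randn(1, 32, 64, 64)  # spectral-branch features
print(block(spa, spe).shape)      # torch.Size([1, 32, 64, 64])
```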
Symmetric image compression network with improved normalization attention mechanism
IF 1.1 · Tier 4 (Computer Science) · Q3 Engineering · Pub Date: 2024-06-11 · DOI: 10.1117/1.jei.33.3.033028
Shen-Chuan Tai, Chia-Mao Yeh, Yu-Ting Lee, Wesley Huang
{"title":"Symmetric image compression network with improved normalization attention mechanism","authors":"Shen-Chuan Tai, Chia-Mao Yeh, Yu-Ting Lee, Wesley Huang","doi":"10.1117/1.jei.33.3.033028","DOIUrl":"https://doi.org/10.1117/1.jei.33.3.033028","url":null,"abstract":"","PeriodicalId":54843,"journal":{"name":"Journal of Electronic Imaging","volume":null,"pages":null},"PeriodicalIF":1.1,"publicationDate":"2024-06-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141358945","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Spatio-temporal co-attention fusion network for video splicing localization
IF 1.1 · Tier 4 (Computer Science) · Q3 Engineering · Pub Date: 2024-06-07 · DOI: 10.1117/1.jei.33.3.033027
Man Lin, Gang Cao, Zijie Lou, Chi Zhang
{"title":"Spatio-temporal co-attention fusion network for video splicing localization","authors":"Man Lin, Gang Cao, Zijie Lou, Chi Zhang","doi":"10.1117/1.jei.33.3.033027","DOIUrl":"https://doi.org/10.1117/1.jei.33.3.033027","url":null,"abstract":"","PeriodicalId":54843,"journal":{"name":"Journal of Electronic Imaging","volume":null,"pages":null},"PeriodicalIF":1.1,"publicationDate":"2024-06-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141375328","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0