
Journal of Visual Communication and Image Representation: Latest Publications

Underwater image enhancement via multicolor space-guided curve estimation
IF 2.6 | CAS Tier 4, Computer Science | Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS | Pub Date: 2024-08-01 | DOI: 10.1016/j.jvcir.2024.104240

High-quality images are of prime importance for vision tasks in underwater environments, ocean exploitation, and marine eco-environment protection. However, due to attenuation and scattering related to wavelength and distance, underwater images often suffer from severe distortion, such as color cast and loss of detail, which makes capturing high-quality images challenging. Existing methods mostly focus on pixel-wise mapping from input to output images, which disrupts the latent relationships between neighboring pixels and suffers from the limited flexibility of linear mapping. Additionally, most methods operate solely within the RGB color space, which is insensitive to image properties such as luminance and saturation and may be inconsistent with human visual perception. In this paper, we combine global and local enhancement, formulating underwater image global enhancement as a task of image-specific piecewise curve estimation based on a deep network and introducing three color spaces (RGB, HSV, CIELab). Our network learns piecewise nonlinear curves specific to different channels of multiple color spaces, thereby introducing highly flexible nonlinear mapping and achieving targeted adjustment. Besides, we employ a channel-wise attention mechanism to allocate weights to the adjustment results from the multiple color spaces, combining their advantages to further enhance image properties such as luminance, saturation, and color, aiming to improve image quality and align it more closely with human visual perception. To assess the performance of the network, extensive experiments are conducted on synthetic and real-world underwater image datasets, with comparisons against state-of-the-art methods. Both quantitative and qualitative results indicate our network's remarkable performance in visual quality and quantitative metrics.
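A minimal sketch of the core idea, a per-channel piecewise curve adjustment, is given below. The knot-based curve representation, the function names, and the toy values are illustrative assumptions, not the authors' implementation; in the paper such curves would be predicted by the network for each channel of RGB, HSV, and CIELab and fused with channel-wise attention.

```python
import numpy as np

def apply_piecewise_curve(channel, knots):
    """Map a [0, 1] channel through a piecewise-linear curve.

    `knots` holds the curve's output values at K evenly spaced control
    points; intermediate inputs are linearly interpolated.
    """
    xs = np.linspace(0.0, 1.0, len(knots))   # control-point positions
    return np.interp(channel, xs, knots)      # per-pixel nonlinear mapping

def enhance_multicolor(rgb, curves):
    """Apply one estimated curve per channel of an RGB image.

    `curves` maps a channel index to a knot array, standing in for the
    per-channel curves a network would predict for each color space.
    """
    out = np.empty_like(rgb)
    for c, knots in curves.items():
        out[..., c] = apply_piecewise_curve(rgb[..., c], knots)
    return np.clip(out, 0.0, 1.0)

# Toy example: brighten a dark synthetic "underwater" image with
# gamma-like knots (purely for demonstration).
rng = np.random.default_rng(0)
img = rng.uniform(0.0, 0.3, size=(4, 4, 3))                 # dark input in [0, 1]
curves = {c: np.linspace(0.0, 1.0, 8) ** 0.5 for c in range(3)}
print(enhance_multicolor(img, curves).max())
```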

Citations: 0
Modification in spatial, extraction from transform: Keyless side-information steganography for JPEG
IF 2.6 | CAS Tier 4, Computer Science | Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS | Pub Date: 2024-08-01 | DOI: 10.1016/j.jvcir.2024.104249

State-of-the-art steganography approaches strictly assume that the receiver has access to a steganographic key. This limitation was mitigated for images in the spatial domain, but that approach does not apply to JPEG images. In this paper, we introduce a keyless steganography scheme for JPEG images. Unlike its spatial-domain counterpart, our approach for the JPEG domain effectively preserves the higher-order statistical models that are used in steganalysis. We show that our approach does not degrade image quality either. The proposed approach is a Side-Information (SI) steganography scheme in the sense that its input is a never-compressed image. Another characteristic of the proposed approach is the separation of the embedding-modification and data-extraction domains, which can initiate further studies of similar approaches in the future.

Citations: 0
Audio–video collaborative JND estimation model for multimedia applications
IF 2.6 | CAS Tier 4, Computer Science | Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS | Pub Date: 2024-08-01 | DOI: 10.1016/j.jvcir.2024.104254

With the rapid development of the Internet and multimedia technologies, multimedia applications integrating audio and video are becoming increasingly prevalent in both everyday life and professional environments. A critical challenge is to significantly enhance compression efficiency and bandwidth utilization while maintaining high-quality user experiences. To address this challenge, the Just Noticeable Distortion (JND) estimation model, which leverages the perceptual characteristics of the Human Visual System (HVS), is widely used in image and video coding for improved data compression. However, human visual perception is an integrative process that involves both visual and auditory stimuli. Therefore, this paper investigates the influence of audio signals on visual perception and presents a collaborative audio–video JND estimation model tailored for multimedia applications. Specifically, we characterize audio loudness, duration, and energy as temporal perceptual features, while assigning the audio saliency superimposed on the image plane as the spatial perceptual feature. An audio JND adjustment factor is then designed using a segmentation function. Finally, the proposed model combines the video-based JND model with the audio JND adjustment factor to form the audio–video collaborative JND estimation model. Compared with existing JND models, the model presented in this paper achieves the best subjective quality, with an average PSNR value of 26.97 dB. The experimental results confirm that audio significantly impacts human visual perception. The proposed audio–video collaborative JND model effectively enhances the accuracy of JND estimation for multimedia data, thereby improving compression efficiency and maintaining high-quality user experiences.
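As a rough illustration of how a per-pixel video JND map could be scaled by an audio adjustment factor built from loudness, duration, and energy through a segmented (piecewise) function, consider the sketch below. All thresholds, gains, and feature combinations are placeholder assumptions, not the paper's actual model.

```python
import numpy as np

def audio_adjustment_factor(loudness_db, duration_s, energy):
    """Piecewise (segmented) audio JND adjustment factor.

    The salience formula, thresholds, and gains are illustrative
    placeholders for the segmentation function described in the paper.
    """
    salience = loudness_db / 60.0 + 0.5 * duration_s + 0.1 * energy
    if salience < 0.5:      # weak audio: vision dominates, JND barely changes
        return 1.0
    elif salience < 1.5:    # moderate audio: some cross-modal masking
        return 1.0 + 0.2 * (salience - 0.5)
    else:                   # strong audio: attention drawn away, larger JND
        return 1.2

def audio_video_jnd(video_jnd_map, loudness_db, duration_s, energy):
    """Scale a per-pixel video JND map by the audio adjustment factor."""
    return video_jnd_map * audio_adjustment_factor(loudness_db, duration_s, energy)

# Toy example: a flat video JND map amplified by a loud, sustained sound.
jnd = np.full((4, 4), 3.0)
print(audio_video_jnd(jnd, loudness_db=70, duration_s=1.0, energy=2.0))
```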

Citations: 0
Semantic attention guided low-light image enhancement with multi-scale perception
IF 2.6 | CAS Tier 4, Computer Science | Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS | Pub Date: 2024-08-01 | DOI: 10.1016/j.jvcir.2024.104242

Low-light environments often lead to complex degradation of captured images. However, most deep learning-based image enhancement methods for low-light conditions only learn a single-channel mapping relationship between the low-light input image and the desired image in normal light, without considering semantic priors. This may cause the network to deviate from the original color of a region. In addition, deep network architectures are not well suited to low-light image recovery due to low pixel values. To address these issues, we propose a novel network called SAGNet. It consists of two branches: the main branch extracts global enhancement features at the level of the original image, while the other branch introduces semantic information through region-based feature learning and learns local enhancement features for semantic regions with multi-level perception to maintain color consistency. The extracted features are merged with the global enhancement features for semantic consistency and visualization. We also propose an unsupervised loss function to improve the network's adaptability to general scenes and reduce the effect of sparse datasets. Extensive experiments and ablation studies show that SAGNet maintains color accuracy better in all cases and keeps natural luminance consistency across the semantic range.
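A minimal PyTorch sketch of the two-branch idea, a global branch plus a semantic branch gated by a region mask, appears below. The layer sizes, the mask-gating scheme, and the fusion layer are illustrative assumptions and not SAGNet's actual architecture.

```python
import torch
import torch.nn as nn

class TwoBranchEnhancer(nn.Module):
    """Toy two-branch enhancer: a global branch over the whole image and a
    semantic branch whose features are gated by a per-pixel semantic mask,
    then fused into the enhanced output. Layer widths are arbitrary."""

    def __init__(self, channels=16):
        super().__init__()
        self.global_branch = nn.Sequential(
            nn.Conv2d(3, channels, 3, padding=1), nn.ReLU(),
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU())
        self.semantic_branch = nn.Sequential(
            nn.Conv2d(3, channels, 3, padding=1), nn.ReLU())
        self.fuse = nn.Conv2d(2 * channels, 3, 3, padding=1)

    def forward(self, low_light, semantic_mask):
        g = self.global_branch(low_light)                     # image-level features
        s = self.semantic_branch(low_light) * semantic_mask   # region-gated features
        return torch.sigmoid(self.fuse(torch.cat([g, s], dim=1)))

# Toy forward pass: one 3x32x32 image with a one-channel region mask.
net = TwoBranchEnhancer()
x = torch.rand(1, 3, 32, 32)
mask = (torch.rand(1, 1, 32, 32) > 0.5).float()   # broadcasts over channels
print(net(x, mask).shape)                          # torch.Size([1, 3, 32, 32])
```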

Citations: 0
Non-local feature aggregation quaternion network for single image deraining
IF 2.6 | CAS Tier 4, Computer Science | Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS | Pub Date: 2024-08-01 | DOI: 10.1016/j.jvcir.2024.104250

Existing deraining methods are based on convolutional neural networks (CNNs) that learn the mapping relationship between rainy and clean images. However, real-valued CNNs process color images as three independent channels, which fails to fully leverage color information. Additionally, sliding-window-based neural networks cannot effectively model the non-local characteristics of an image. In this work, we propose a non-local feature aggregation quaternion network (NLAQNet), which is composed of two concurrent sub-networks: the Quaternion Local Detail Repair Network (QLDRNet) and the Multi-Level Feature Aggregation Network (MLFANet). In the QLDRNet sub-network, the Local Detail Repair Block (LDRB) is proposed to repair the backdrop of an image that has not been damaged by rain streaks. Within the MLFANet sub-network, we introduce two specialized blocks, the Non-Local Feature Aggregation Block (NLAB) and the Feature Aggregation Block (Mix), specifically designed to restore rain-streak-damaged image backgrounds. Extensive experiments demonstrate that the proposed network delivers strong performance in both qualitative and quantitative evaluations on existing datasets. The code is available at https://github.com/xionggonghe/NLAQNet.
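For readers unfamiliar with quaternion color representation, the sketch below shows how an RGB pixel can be treated as a single pure quaternion and mixed by one quaternion weight via the Hamilton product, the basic operation quaternion networks build on; the toy weight and pixel values are arbitrary and not taken from the paper.

```python
import numpy as np

def hamilton_product(p, q):
    """Hamilton product of two quaternions given as (w, x, y, z) arrays."""
    w1, x1, y1, z1 = p
    w2, x2, y2, z2 = q
    return np.array([
        w1 * w2 - x1 * x2 - y1 * y2 - z1 * z2,
        w1 * x2 + x1 * w2 + y1 * z2 - z1 * y2,
        w1 * y2 - x1 * z2 + y1 * w2 + z1 * x2,
        w1 * z2 + x1 * y2 - y1 * x2 + z1 * w2,
    ])

# An RGB pixel is treated as a pure quaternion (0, R, G, B), so a single
# quaternion weight mixes all three color channels jointly rather than
# treating them as independent real-valued channels.
pixel = np.array([0.0, 0.8, 0.4, 0.2])    # (0, R, G, B)
weight = np.array([0.9, 0.1, 0.1, 0.1])   # one quaternion "filter" weight
print(hamilton_product(weight, pixel))
```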

Citations: 0
Hypergraph clustering based multi-label cross-modal retrieval
IF 2.6 | CAS Tier 4, Computer Science | Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS | Pub Date: 2024-08-01 | DOI: 10.1016/j.jvcir.2024.104258

Most existing cross-modal retrieval methods face challenges in establishing semantic connections between different modalities due to the inherent heterogeneity among them. To establish semantic connections between modalities and align relevant semantic features across them, so as to fully capture important information within the same modality, this paper exploits the superiority of hypergraphs in representing higher-order relationships and proposes an image-text retrieval method based on hypergraph clustering. Specifically, we construct hypergraphs to capture feature relationships within the image and text modalities, as well as between image and text. This allows us to effectively model complex relationships between features of different modalities and explore the semantic connectivity within and across modalities. To compensate for potential semantic feature loss during the construction of the hypergraph neural network, we design a weight-adaptive coarse- and fine-grained feature fusion module for semantic supplementation. Comprehensive experimental results on three common datasets demonstrate the effectiveness of the proposed method.
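A common way to build such a hypergraph is to let each hyperedge group a sample with its k nearest neighbors in feature space. The sketch below illustrates this construction of the node-by-hyperedge incidence matrix; the paper's exact clustering-based scheme may differ.

```python
import numpy as np

def knn_hypergraph_incidence(features, k=2):
    """Build a hypergraph incidence matrix H (nodes x hyperedges) where each
    hyperedge groups a sample with its k nearest neighbors in feature space.
    This is a standard construction, assumed here for illustration."""
    n = len(features)
    dists = np.linalg.norm(features[:, None, :] - features[None, :, :], axis=-1)
    H = np.zeros((n, n))
    for e in range(n):                              # one hyperedge per centroid sample
        neighbors = np.argsort(dists[e])[:k + 1]    # the sample itself plus its k NNs
        H[neighbors, e] = 1.0
    return H

# Toy cross-modal features (e.g. image and text embeddings stacked together).
rng = np.random.default_rng(1)
feats = rng.normal(size=(6, 4))
H = knn_hypergraph_incidence(feats, k=2)
print(H.sum(axis=0))   # each hyperedge connects k + 1 = 3 nodes
```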

Citations: 0
Corrigendum to “Heterogeneity constrained color ellipsoid prior image dehazing algorithm” [J. Vis. Commun. Image Represent. 101 (2024) 104177]
IF 2.6 | CAS Tier 4, Computer Science | Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS | Pub Date: 2024-08-01 | DOI: 10.1016/j.jvcir.2024.104235
{"title":"Corrigendum to “Heterogeneity constrained color ellipsoid prior image dehazing algorithm” [J. Vis. Commun. Image Represent. 101 (2024) 104177]","authors":"","doi":"10.1016/j.jvcir.2024.104235","DOIUrl":"10.1016/j.jvcir.2024.104235","url":null,"abstract":"","PeriodicalId":54755,"journal":{"name":"Journal of Visual Communication and Image Representation","volume":null,"pages":null},"PeriodicalIF":2.6,"publicationDate":"2024-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S1047320324001913/pdfft?md5=acb08692ca9b1d2f6bd84d46fa591d30&pid=1-s2.0-S1047320324001913-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141694814","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
DDR: A network of image deraining systems for dark environments
IF 2.6 | CAS Tier 4, Computer Science | Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS | Pub Date: 2024-08-01 | DOI: 10.1016/j.jvcir.2024.104244

In the domain of computer vision, addressing the degradation of image quality under adverse weather conditions remains a significant challenge. To tackle the challenges of image enhancement and deraining in dark settings, we integrate image enhancement and deraining technologies to develop the DDR (Dark Environment Deraining Network) system. This specialized network is designed to enhance and clarify low-light images degraded by raindrops. DDR employs a strategic divide-and-conquer approach and an apt network selection to discern patterns of raindrops and background elements within images. It is capable of mitigating the noise and blurring induced by raindrops in dark settings, thus enhancing the visual fidelity of images. Through testing on real-world imagery and the Rain LOL dataset, this innovative network offers a robust solution for deraining tasks in dark conditions, inspiring advancements in the performance of computer vision systems under challenging weather scenarios. The research on DDR provides technical and theoretical support for improving image quality in dark environments.

Citations: 0
High-capacity multi-MSB predictive reversible data hiding in encrypted domain for triangular mesh models
IF 2.6 | CAS Tier 4, Computer Science | Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS | Pub Date: 2024-08-01 | DOI: 10.1016/j.jvcir.2024.104246

Reversible data hiding in the encrypted domain (RDH-ED) is widely used in sensitive fields such as privacy protection and copyright authentication. However, the embedding capacity of existing methods is generally low because the model topology is not fully exploited. To improve the embedding capacity, this paper proposes a high-capacity multi-MSB predictive reversible data hiding scheme in the encrypted domain (MMPRDH-ED). First, the 3D model is subdivided by a triangular mesh subdivision (TMS) algorithm, and its vertices are divided into a reference set and an embedded set. Then, to make full use of the redundant space of the embedded vertices, Multi-MSB Prediction (MMP) and a Multi-Layer Embedding Strategy (MLES) are used to improve the capacity. Finally, stream encryption is used to encrypt the model and data to ensure data security. Experimental results show that, compared with existing methods, the embedding capacity of MMPRDH-ED is increased by 53%, giving it a clear advantage.
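The capacity gain of multi-MSB prediction comes from the fact that a vertex coordinate and its prediction from neighboring reference vertices usually share many leading bits, which can then be freed to carry payload. The sketch below illustrates that counting step with a mean-of-neighbors predictor, an assumed stand-in for the paper's actual predictor and quantization.

```python
import numpy as np

def common_msb_count(value, prediction, bits=16):
    """Number of most-significant bits shared by a coordinate and its
    prediction; those bits are recoverable and can be replaced with payload."""
    diff = value ^ prediction
    if diff == 0:
        return bits
    return bits - diff.bit_length()

# Toy example: coordinates quantized to 16-bit integers; the prediction is
# the mean of neighboring reference vertices (an illustrative predictor).
vertex = 0b1011001010111000
neighbors = [0b1011001010110001, 0b1011001011000010]
prediction = int(np.mean(neighbors))
capacity_bits = common_msb_count(vertex, prediction)
print(capacity_bits)   # MSBs that can be freed for embedding in this vertex
```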

Citations: 0
Facial feature point detection under large range of face deformations
IF 2.6 | CAS Tier 4, Computer Science | Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS | Pub Date: 2024-08-01 | DOI: 10.1016/j.jvcir.2024.104264

Facial Feature Point Detection (FFPD) plays a significant role in several face analysis tasks such as feature extraction and classification. This paper presents a fully automatic FFPD system based on Random Forest Regression Voting in a Constrained Local Model (RFRV-CLM) framework. A global detector is used to find the approximate positions of the facial region and eye centers. A sequence of local RFRV-CLMs is then used to locate a detailed set of points around the facial features. Both the global and local models use Random Forest Regression to vote for optimal positions. The system is evaluated on the task of facial expression localization using five facial expression databases with different characteristics, covering age, intensity, the 6 basic expressions, 22 compound expressions, static and dynamic images, and deliberate and spontaneous expressions. Quantitative evaluation of automatic point localization against manually annotated points (ground truth) demonstrates that the results of the proposed approach are encouraging and outperform alternative techniques tested on the same databases.
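The regression-voting step can be pictured as each sampled patch casting a vote at its predicted feature-point location and taking the densest cell. The sketch below is a simplified stand-in for the Random Forest Regression Voting used in RFRV-CLM, with hand-picked offsets in place of forest predictions.

```python
import numpy as np

def regression_voting(patch_centers, predicted_offsets, image_shape):
    """Accumulate votes for a feature point: every sampled patch casts a vote
    at (its center + its regressed offset); the densest cell wins."""
    votes = np.zeros(image_shape, dtype=int)
    for (cy, cx), (dy, dx) in zip(patch_centers, predicted_offsets):
        y, x = cy + dy, cx + dx
        if 0 <= y < image_shape[0] and 0 <= x < image_shape[1]:
            votes[y, x] += 1
    return np.unravel_index(np.argmax(votes), votes.shape)

# Toy example: three patches agree on (12, 20); one patch is an outlier.
centers = [(10, 18), (14, 22), (11, 21), (30, 5)]
offsets = [(2, 2), (-2, -2), (1, -1), (0, 0)]
print(regression_voting(centers, offsets, (64, 64)))   # -> (12, 20)
```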

Citations: 0