Journal of Visual Communication and Image Representation: Latest Articles

Stochastic textures modeling and its application in texture structure decomposition

IF 2.6 | Zone 4, Computer Science | Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS

Journal of Visual Communication and Image Representation

Pub Date : 2025-02-15 DOI: 10.1016/j.jvcir.2025.104411

Samah Khawaled , Yehoshua Y. Zeevi

Natural stochastic textures coexist in images with complementary edge-type structural elements that constitute the cartoon-type skeleton of an image. Separating texture from the structure of a natural image is an important inverse problem in image analysis. In this decomposition, the textural layer, which conveys fine details and small-scale variations, is separated from the image macrostructures (edges and contours). We propose a variational texture-structure separation scheme. Our approach models texture as a stochastic field: the 2D fractional Brownian motion (fBm), a non-stationary Gaussian self-similar process, which is a suitable model for pure natural stochastic textures. We use it as a reconstruction prior to extract the corresponding textural element and show that this separation is crucial for improving the performance of various image processing tasks such as image denoising. Lastly, we highlight how a manifold-based representation of texture-structure data can be used to extract geometric features and construct a classification space.
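
For readers unfamiliar with the fBm prior named in this abstract, the standard isotropic definition can be stated as below. This is the textbook form of the 2D fBm (the Lévy fractional Brownian field), not necessarily the exact parametrization used in the paper; the symbols $B_H$, $H$ (Hurst exponent) and $\sigma$ are notation assumed here.

```latex
% Zero-mean Gaussian field B_H on R^2 with Hurst exponent 0 < H < 1
\mathbb{E}\big[B_H(\mathbf{x})\,B_H(\mathbf{y})\big]
  = \frac{\sigma^2}{2}\Big(\|\mathbf{x}\|^{2H} + \|\mathbf{y}\|^{2H}
      - \|\mathbf{x}-\mathbf{y}\|^{2H}\Big)

% Stationary increments and self-similarity
\mathbb{E}\big[(B_H(\mathbf{x}) - B_H(\mathbf{y}))^2\big]
  = \sigma^2\,\|\mathbf{x}-\mathbf{y}\|^{2H},
\qquad
B_H(a\mathbf{x}) \overset{d}{=} a^{H} B_H(\mathbf{x}), \quad a > 0
```

The covariance depends on the positions $\mathbf{x}$ and $\mathbf{y}$ themselves, not only on their difference, which is why the field is non-stationary even though its increments are stationary.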

Citations: 0

RQVR: A multi-exposure image fusion network that optimizes rendering quality and visual realism

IF 2.6 | Zone 4, Computer Science | Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS

Journal of Visual Communication and Image Representation

Pub Date : 2025-02-10 DOI: 10.1016/j.jvcir.2025.104410

Xiaokang Liu , Enlong Wang , Huizi Man , Shihua Zhou , Yueping Wang

Deep learning has made significant strides in multi-exposure image fusion in recent years. However, it is still challenging to maintain the integrity of texture details and illumination. This paper proposes a novel multi-exposure image fusion method to optimize Rendering Quality and Visual Realism (RQVR), addressing limitations in recovering details lost under extreme lighting conditions. The Contextual and Edge-aware Module (CAM) enhances image quality by balancing global features and local details, preserving the texture details of fused images. To enhance the realism of visual effects, an Illumination Equalization Module (IEM) is designed to optimize light adjustment. Moreover, a fusion module (FM) is introduced to minimize information loss in the fused images. Comprehensive experiments conducted on two datasets demonstrate that our proposed method surpasses existing state-of-the-art techniques. The results show that our method not only attains substantial improvements in image quality but also outperforms most advanced techniques in terms of computational efficiency.

Citations: 0

Capsule network with using shifted windows for 3D human pose estimation

IF 2.6 | Zone 4, Computer Science | Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS

Journal of Visual Communication and Image Representation

Pub Date : 2025-02-10 DOI: 10.1016/j.jvcir.2025.104409

Xiufeng Liu , Zhongqiu Zhao , Weidong Tian , Binbin Liu , Hongmei He

3D human pose estimation (HPE) is a vital technology with diverse applications, enhancing precision in tracking, analyzing, and understanding human movements. However, 3D HPE from monocular videos presents significant challenges, primarily due to self-occlusion, which can partially hinder traditional neural networks’ ability to accurately predict joint positions. To address this challenge, we propose a novel approach using a capsule network integrated with the shifted windows attention model (SwinCAP). It improves prediction accuracy by effectively capturing the spatial hierarchical relationships between different parts and objects. A Parallel Double Attention mechanism applied in SwinCAP enhances both computational efficiency and modeling capacity, and a Multi-Attention Collaborative module is introduced to capture a diverse range of information, including both coarse and fine details. Extensive experiments demonstrate that our SwinCAP achieves results better than or comparable to state-of-the-art models in the challenging task of viewpoint transfer on two commonly used datasets: Human3.6M and MPI-INF-3DHP.

Citations: 0

Semantic-guided face inpainting with subspace pyramid aggregation

IF 2.6 | Zone 4, Computer Science | Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS

Journal of Visual Communication and Image Representation

Pub Date : 2025-02-08 DOI: 10.1016/j.jvcir.2025.104408

Yaqian Li, Xiumin Zhang, Cunjun Xiao

With the recent advancement of Generative Adversarial Networks, image inpainting has improved, but the complexity of face structure makes face inpainting more challenging. This is mainly due to two factors: (1) the lack of geometric relations between facial features needed to synthesize fine textures, and (2) the difficulty of repairing an occluded area based on known pixels at a distance, especially when the face is occluded over a large area. This paper proposes a face inpainting method based on semantic feature guidance and an aggregated subspace pyramid module, where we use the semantic features of masked faces as prior knowledge to guide the inpainting of masked areas. In addition, we propose an ASPM (Aggregated Subspace Pyramid Module), which aggregates contextual information from different receptive fields and allows the capture of distant information. We conduct experiments on the CelebAMask-HQ and FlickrFaces-HQ datasets; qualitative and quantitative studies show that the method surpasses state-of-the-art methods. Code is available at https://github.com/xiumin123/Face_ inpainting.

Citations: 0

Enhancing visibility in hazy conditions: A multimodal multispectral image dehazing approach

IF 2.6 | Zone 4, Computer Science | Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS

Journal of Visual Communication and Image Representation

Pub Date : 2025-02-07 DOI: 10.1016/j.jvcir.2025.104407

Mohammad Mahdizadeh , Peng Ye , Shaoqing Zhao

Improving visibility in hazy conditions is crucial for many image processing applications. Traditional single-image dehazing methods rely heavily on recoverable details from RGB images, limiting their effectiveness in dense haze. To overcome this, we propose a novel multimodal multispectral approach combining hazy RGB and Near-Infrared (NIR) images. First, an initial haze reduction enhances the saturation of the RGB image. Then, feature mapping networks process both the NIR and dehazed RGB images. The resulting feature maps are fused using a cross-modal fusion strategy and processed through convolutional layers to reconstruct a haze-free image. Finally, fusing the integrated dehazed image with the NIR image mitigates over- and under-exposure and improves overall quality. Our method outperforms state-of-the-art techniques on the EPFL dataset, achieving notable improvements across four key metrics. Specifically, it demonstrates a significant enhancement of 0.1932 in the FADE metric, highlighting its superior performance in terms of haze reduction and image quality. The code and implementation details are available at https://github.com/PaulMahdizadeh123/MultimodalDehazing.

Citations: 0

A two-step enhanced tensor denoising framework based on noise position prior and adaptive ring rank

IF 2.6 | Zone 4, Computer Science | Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS

Journal of Visual Communication and Image Representation

Pub Date : 2025-02-07 DOI: 10.1016/j.jvcir.2025.104406

Boyuan Li , Yali Fan , Weidong Zhang , Yan Song

Recently, low-rank tensor recovery has garnered significant attention. Its objective is to recover a clean tensor from an observation tensor that has been corrupted. However, existing methods typically do not exploit the prior information of the noise’s position, and methods based on tensor ring decomposition also require a preset rank. In this paper, we propose a framework that leverages this prior information to transform the denoising problem into a complementary one, ultimately achieving effective tensor denoising. This framework consists of two steps: first, we apply an efficient denoising method to obtain the noise prior and identify the noise’s positions; second, we treat these positions as missing values and perform tensor ring completion. In the completion problem, we propose a tensor ring completion model with an adaptive rank incremental strategy, effectively addressing the preset rank problem. Our framework is implemented using the alternating direction method of multipliers (ADMM). Our method has been demonstrated to be superior through extensive experiments conducted on both synthetic and real data.

Citations: 0

ACGC: Adaptive chrominance gamma correction for low-light image enhancement

IF 2.6 | Zone 4, Computer Science | Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS

Journal of Visual Communication and Image Representation

Pub Date : 2025-02-05 DOI: 10.1016/j.jvcir.2025.104402

N. Severoglu, Y. Demir, N.H. Kaplan, S. Kucuk

Capturing high-quality images becomes challenging in low-light conditions, often resulting in underexposed and blurry images. Only a few works can address these problems simultaneously. This paper presents a low-light image enhancement scheme based on the Y-I-Q transform and bilateral filter in least squares, named ACGC. The method involves applying a pre-correction to the input image, followed by the Y-I-Q transform. The obtained Y component is separated into its low and high-frequency layers. Local gamma correction is applied to the low-frequency layers, followed by contrast limited adaptive histogram equalization (CLAHE), and these layers are added up to produce an enhanced Y component. The remaining I and Q components are also enhanced with local gamma correction to provide images with a more natural color. Finally, the inverse Y-I-Q transform is employed to create the enhanced image. The experimental results demonstrate that the proposed approach yields superior visual quality and more natural colors compared to the state-of-the-art methods.
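
The core idea of this abstract, gamma-correcting luminance in Y-I-Q space, can be illustrated with a minimal per-pixel sketch. This is not the ACGC pipeline: the pre-correction, the low/high-frequency split, CLAHE, and the chrominance correction are all omitted, and the function name `brighten_pixel` and the default `gamma` value are assumptions made for illustration. It uses only Python's standard `colorsys` module.

```python
import colorsys


def brighten_pixel(rgb, gamma=0.5):
    """Gamma-correct the luminance of one RGB pixel (floats in [0, 1]).

    Illustrative only: a single global gamma on Y, with I and Q left
    unchanged. The paper describes local gamma correction instead.
    """
    y, i, q = colorsys.rgb_to_yiq(*rgb)
    # A gamma below 1 lifts dark luminance values toward mid-gray.
    y = max(y, 0.0) ** gamma
    # yiq_to_rgb clamps each output channel to [0, 1].
    return colorsys.yiq_to_rgb(y, i, q)


if __name__ == "__main__":
    print(brighten_pixel((0.25, 0.25, 0.25)))
```

Working on Y alone changes brightness while leaving the chrominance coordinates untouched, which is the usual motivation for enhancing in a luminance-chrominance space rather than directly in RGB.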

Citations: 0

Noise variances and regularization learning gradient descent network for image deconvolution

IF 2.6 | Zone 4, Computer Science | Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS

Journal of Visual Communication and Image Representation

Pub Date : 2025-02-05 DOI: 10.1016/j.jvcir.2025.104391

Shengjiang Kong , Weiwei Wang , Yu Han , Xiangchu Feng

Existing image deblurring approaches usually assume uniform Additive White Gaussian Noise (AWGN). However, the noise in real-world images is generally non-uniform AWGN and exhibits variations across different images. This work presents a deep learning framework for image deblurring that addresses non-uniform AWGN. We introduce a novel data fitting term within a regularization framework to better handle noise variations. Using a gradient descent algorithm, we learn the inverse covariance of the non-uniform AWGN, the gradient of the regularization term, and the gradient adjusting factor from data. To achieve this, we unroll the gradient descent iteration into an end-to-end trainable network, where these components are parameterized by convolutional neural networks. The proposed model is called the noise variances and regularization learning gradient descent network (NRL-GDN). Its major advantage is that it can automatically deal with both uniform and non-uniform AWGN. Experimental results on synthetic and real-world images demonstrate its superiority over existing baselines.
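
One plausible reading of the gradient descent iteration described in this abstract is sketched below. The notation is assumed, not taken from the paper: $y = Ax + n$ with blur operator $A$, noise $n \sim \mathcal{N}(0, \Sigma)$ with a diagonal $\Sigma$ of per-pixel variances, a regularizer $R$, and a step size $\eta_k$ standing in for the gradient adjusting factor.

```latex
% Weighted data-fitting term plus regularizer (assumed form)
\min_{x}\; \tfrac{1}{2}\,(Ax - y)^{\top}\,\Sigma^{-1}\,(Ax - y) + R(x)

% One gradient descent step; unrolling K such steps gives a K-stage network
x^{(k+1)} = x^{(k)} - \eta_k \Big[ A^{\top}\Sigma^{-1}\big(Ax^{(k)} - y\big)
            + \nabla R\big(x^{(k)}\big) \Big]
```

Under this reading, the quantities $\Sigma^{-1}$, $\nabla R$ and $\eta_k$ are the three components the abstract says are learned from data. With uniform noise, $\Sigma^{-1}$ reduces to a scalar multiple of the identity and the step becomes the usual least-squares gradient update.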

Citations: 0

GroupRF: Panoptic Scene Graph Generation with group relation tokens

IF 2.6 | Zone 4, Computer Science | Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS

Journal of Visual Communication and Image Representation

Pub Date : 2025-02-04 DOI: 10.1016/j.jvcir.2025.104405

Hongyun Wang , Jiachen Li , Xiang Xiang , Qing Xie , Yanchun Ma , Yongjian Liu

Panoptic Scene Graph Generation (PSG) aims to predict a variety of relations between pairs of objects within an image, and to indicate the objects by panoptic segmentation masks instead of bounding boxes. Existing PSG methods attempt to straightforwardly fuse the object tokens for relation prediction, thus failing to fully utilize the interaction between the pairwise objects. To address this problem, we propose a novel framework named Group RelationFormer (GroupRF) to capture the fine-grained inter-dependency among all instances. Our method introduces a set of learnable tokens termed group rln tokens, which exploit fine-grained contextual interaction between object tokens with multiple attentive relations. In the process of relation prediction, we adopt multiple triplets to take advantage of the fine-grained interaction included in group rln tokens. We conduct comprehensive experiments on the OpenPSG dataset, which show that our method outperforms the previous state-of-the-art method. Furthermore, we also show the effectiveness of our framework through ablation studies. Our code is available at https://github.com/WHY-student/GroupRF.

Citations: 0

Enhancing low-light color image visibility with hybrid contrast and saturation modification using a saturation-aware map

IF 2.6 | Zone 4, Computer Science | Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS

Journal of Visual Communication and Image Representation

Pub Date : 2025-02-01 DOI: 10.1016/j.jvcir.2025.104392

Sepideh Khormaeipour, Fatemeh Shakeri

In this paper, we present a two-stage technique for color image enhancement. In the first stage, we apply the well-established Histogram Equalization method to enhance the overall contrast of the image. This is followed by a local enhancement method to address the differences in average local contrast between the original and enhanced images. In the second stage, we introduce a novel weighted map within a variational framework to adjust the saturation of the contrast-enhanced image. This weighted map identifies regions that require saturation modification and enables a controllable level of adjustment. The map is then multiplied by a maximally saturated color image derived from the original image, and the result is merged with the contrast-enhanced image. Compared to the original low-light image, our method significantly improves image quality, structure, color preservation, and saturation. Additionally, numerical experiments demonstrate that the proposed method outperforms other enhancement techniques in both qualitative and quantitative evaluations.
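
The first stage named in this abstract, Histogram Equalization, can be sketched for a flat list of grayscale intensities as follows. This shows only the classical global method on a single channel; how the paper applies it to color images, and its local enhancement and saturation stages, are not covered. The function name `equalize_histogram` is hypothetical.

```python
def equalize_histogram(pixels, levels=256):
    """Global histogram equalization for integer intensities in [0, levels)."""
    # Count how often each intensity occurs.
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    # Cumulative distribution of intensities.
    cdf, total = [], 0
    for count in hist:
        total += count
        cdf.append(total)

    n = len(pixels)
    cdf_min = next(c for c in cdf if c > 0)
    if n == cdf_min:
        # Only one distinct intensity: nothing to stretch.
        return list(pixels)

    # Map each intensity through the normalized CDF onto the full range.
    scale = (levels - 1) / (n - cdf_min)
    return [round((cdf[p] - cdf_min) * scale) for p in pixels]


if __name__ == "__main__":
    print(equalize_histogram([100, 100, 101, 102, 102, 103]))
    # prints [0, 0, 64, 191, 191, 255]
```

The example input occupies only four adjacent gray levels; after equalization the same pixels span the whole 0 to 255 range, which is the contrast gain the first stage relies on.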

Citations: 0
