StyleMM: Stylized 3D Morphable Face Model via Text-Driven Aligned Image Translation
Seungmi Lee, Kwan Yun, Junyong Noh
Computer Graphics Forum 44(7), 2025-10-11. DOI: 10.1111/cgf.70234

We introduce StyleMM, a novel framework that constructs a stylized 3D Morphable Model (3DMM) from user-defined text descriptions specifying a target style. Building upon a pre-trained mesh deformation network and a texture generator for realistic 3DMM-based human faces, our approach fine-tunes these models using stylized facial images generated via text-guided image-to-image (i2i) translation with a diffusion model; these images serve as stylization targets for the rendered mesh. To prevent undesired changes in identity, facial alignment, or expression during i2i translation, we introduce a stylization method that explicitly preserves the facial attributes of the source image. By maintaining these critical attributes during image stylization, the proposed approach ensures consistent 3D style transfer across the 3DMM parameter space through image-based training. Once trained, StyleMM enables feed-forward generation of stylized face meshes with explicit control over shape, expression, and texture parameters, producing meshes with consistent vertex connectivity and animatability. Quantitative and qualitative evaluations demonstrate that our approach outperforms state-of-the-art methods in terms of identity-level facial diversity and stylization capability. The code and videos are available at kwanyun.github.io/stylemm_page.

Categories and Subject Descriptors (according to ACM CCS): I.3.6 [Computer Graphics]: Methodology and Techniques
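The explicit shape/expression/texture control described above rests on the standard linear 3DMM formulation. A minimal sketch of that underlying parametric model, with toy dimensions and random bases standing in for StyleMM's learned deformation and texture networks (all names and sizes here are illustrative, not the paper's):

```python
import numpy as np

# Toy linear 3DMM: vertices = mean shape + shape-basis and expression-basis
# offsets. StyleMM fine-tunes learned networks on top of such a model; the
# random bases below are placeholders for illustration only.
n_verts, n_shape, n_expr = 500, 40, 20
rng = np.random.default_rng(4)
mean = rng.standard_normal((n_verts, 3))
shape_basis = 0.1 * rng.standard_normal((n_shape, n_verts, 3))
expr_basis = 0.05 * rng.standard_normal((n_expr, n_verts, 3))

def morph(alpha, beta):
    """Feed-forward mesh generation from shape (alpha) and expression (beta)
    parameters; vertex connectivity stays fixed, so outputs are animatable."""
    return (mean
            + np.tensordot(alpha, shape_basis, axes=1)
            + np.tensordot(beta, expr_basis, axes=1))

verts = morph(rng.standard_normal(n_shape), np.zeros(n_expr))
```

Because the vertex count and connectivity never change across parameter values, any expression sequence in beta yields a consistent animation of the same mesh.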
Automatic Reconstruction of Woven Cloth from a Single Close-up Image
C. Wu, A. Khattar, J. Zhu, S. Pettifer, L. Yan, Z. Montazeri
Computer Graphics Forum 44(7), 2025-10-08. DOI: 10.1111/cgf.70243

Digital replication of woven fabrics presents significant challenges across a variety of sectors, from online retail to the entertainment industry. To address this, we introduce an inverse rendering pipeline designed to estimate the pattern, geometry, and appearance parameters of woven fabrics given a single close-up image as input. Our work simultaneously optimizes both discrete and continuous parameters without manual intervention. It recovers a wide array of parameters: discrete elements such as the weave pattern and the ply and fiber counts using simulated annealing, and continuous parameters such as reflection and transmission components, aligned with the target appearance through differentiable rendering. Irregularities caused by deformation and flyaways are approximated with 2D Gaussians as a post-processing step. Our work does not pursue a perfect match of every fine detail; rather, it targets an automatic, end-to-end reconstruction pipeline that is robust to slight camera rotations and room lighting conditions and runs in acceptable time (15 minutes on CPU), unlike previous works that are expensive, require manual intervention, assume a given pattern, geometry, or appearance, or strictly control camera and lighting conditions.
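The discrete search over weave pattern, ply, and fiber counts uses simulated annealing. A generic, self-contained sketch of that optimizer on a toy discrete space (the energy, neighbor move, and parameter ranges below are invented for illustration and are not the paper's actual objective):

```python
import math
import random

random.seed(0)  # reproducible toy run

def simulated_annealing(init_state, neighbor, energy,
                        t0=1.0, t_min=1e-3, cooling=0.95, steps_per_t=20):
    """Generic simulated annealing over a discrete state space."""
    state, e = init_state, energy(init_state)
    best, best_e = state, e
    t = t0
    while t > t_min:
        for _ in range(steps_per_t):
            cand = neighbor(state)
            e_cand = energy(cand)
            # Accept downhill moves always; uphill moves with Boltzmann probability.
            if e_cand < e or random.random() < math.exp((e - e_cand) / t):
                state, e = cand, e_cand
                if e < best_e:
                    best, best_e = state, e
        t *= cooling
    return best, best_e

# Toy problem: recover a hidden (pattern id, ply count, fiber count) triple.
target = (2, 3, 48)

def energy(s):
    return sum(abs(a - b) for a, b in zip(s, target))

def neighbor(s):
    i = random.randrange(3)
    lo, hi = [(0, 4), (1, 8), (1, 64)][i]
    s = list(s)
    s[i] = min(hi, max(lo, s[i] + random.choice((-1, 1))))
    return tuple(s)

best, best_e = simulated_annealing((0, 1, 1), neighbor, energy)
```

The appeal for this setting is that the same accept/reject loop works for categorical choices (weave pattern) and integer counts alike, with no gradients required.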
Computational Design of Body-Supporting Assemblies
Yixuan He, Rulin Chen, Bailin Deng, Peng Song
Computer Graphics Forum 44(7), 2025-10-08. DOI: 10.1111/cgf.70237

A body-supporting assembly is an assembly of parts that physically supports a human body during activities like sitting, lying, or leaning. A body-supporting assembly has a complex global shape to support a specific human body posture, yet each component part has a relatively simple geometry to facilitate fabrication, storage, and maintenance. In this paper, we model and design a personalized body-supporting assembly that fits a given human body posture, with the goal of making the assembly comfortable to use. We choose to model a body-supporting assembly from scratch to offer high flexibility for fitting a given body posture, which, however, makes it challenging to determine the assembly's topology and geometry. To address this problem, we classify parts in the assembly into two categories according to their functionality: supporting parts for fitting different portions of the body and connecting parts for connecting all the supporting parts to form a stable structure. We also propose a geometric representation of supporting parts such that they can take a variety of shapes controlled by a few parameters. Given a body posture as input, we present a computational approach for designing a body-supporting assembly that fits the posture, in which the supporting parts are initialized and optimized to minimize a discomfort measure and the connecting parts are then generated using a procedural approach. We demonstrate the effectiveness of our approach by designing body-supporting assemblies that accommodate a variety of body postures and by 3D printing two of them for physical validation.
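To make the "optimize supporting parts to minimize a discomfort measure" step concrete, here is a deliberately crude sketch: the body is sampled as 1D support heights, each supporting part is a flat pad with one height parameter, and discomfort is the squared gap between pad and body. Both the discomfort model and the data are invented for illustration; the paper's measure and part geometry are far richer.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical body posture: support heights sampled along the body.
body = np.sin(np.linspace(0, np.pi, 12)) + 0.05 * rng.standard_normal(12)

# Each of 12 samples is assigned to one of 4 supporting pads.
assign = np.repeat(np.arange(4), 3)

def discomfort(heights):
    """Squared gap between each body sample and the pad supporting it."""
    return np.sum((body - heights[assign]) ** 2)

# For this toy quadratic measure, the optimum per pad is simply the mean
# of its assigned samples (a stand-in for the paper's optimization).
opt = np.array([body[assign == k].mean() for k in range(4)])
```

Even this toy version shows the trade-off the paper navigates: fewer pads means simpler parts but larger residual gaps, i.e. more discomfort.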
DAATSim: Depth-Aware Atmospheric Turbulence Simulation for Fast Image Rendering
Ripon Kumar Saha, Yufan Zhang, Jinwei Ye, Suren Jayasuriya
Computer Graphics Forum 44(7), 2025-10-08. DOI: 10.1111/cgf.70241

Simulating the effects of atmospheric turbulence for imaging systems operating over long distances is a significant challenge for optical and computer graphics models. Physically-based ray tracing over kilometers of distance is difficult due to the need to define a spatio-temporal volume of varying refractive index. Even if such a volume can be defined, Monte Carlo rendering approximations for light refraction through the environment would not yield the real-time solutions needed for video game engines or online dataset augmentation for machine learning. While existing simulators based on procedurally-generated noise or textures have been proposed in these settings, they often neglect the significant impact of scene depth, leading to unrealistic degradations for scenes with substantial foreground-background separation. This paper introduces a novel, physically-based atmospheric turbulence simulator that explicitly models depth-dependent effects while rendering frames at interactive/near real-time rates (>10 FPS) for image resolutions up to 1024×1024 (real-time 35 FPS at 256×256 resolution with depth, or 33 FPS at 512×512 without depth). Our hybrid approach combines spatially-varying wavefront aberrations using Zernike polynomials with pixel-wise depth modulation of both blur (via Point Spread Function interpolation) and geometric distortion or tilt. Our approach includes a novel fusion technique that integrates complementary strengths of leading monocular depth estimators to generate metrically accurate depth maps with enhanced edge fidelity. DAATSim is implemented efficiently on GPUs using PyTorch, incorporating optimizations such as mixed-precision computation and caching. We present quantitative and qualitative validation demonstrating the simulator's physical plausibility for generating turbulent video. DAATSim is made publicly available and open-source to the community: https://github.com/Riponcs/DAATSim.
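The "pixel-wise depth modulation of blur via PSF interpolation" idea can be sketched in a few lines: precompute the image blurred at a few turbulence strengths, then interpolate between those planes per pixel according to depth. This NumPy toy (64×64 random image, three Gaussian blur strengths) is only a schematic of the approach, not DAATSim's implementation:

```python
import numpy as np

def gaussian_kernel(sigma, radius):
    x = np.arange(-radius, radius + 1)
    k = np.exp(-0.5 * (x / sigma) ** 2)
    return k / k.sum()

def blur(img, sigma):
    """Separable Gaussian blur (stand-in for a turbulence PSF)."""
    if sigma <= 0:
        return img.copy()
    k = gaussian_kernel(sigma, radius=int(3 * sigma) + 1)
    out = np.apply_along_axis(lambda r: np.convolve(r, k, mode="same"), 1, img)
    return np.apply_along_axis(lambda c: np.convolve(c, k, mode="same"), 0, out)

rng = np.random.default_rng(1)
img = rng.random((64, 64))
depth = np.tile(np.linspace(0.0, 1.0, 64), (64, 1))  # 0 = near, 1 = far

# Precompute blurred planes at a few strengths, then interpolate per pixel
# by depth: farther pixels get stronger turbulence blur.
sigmas = [0.0, 1.0, 2.0]
planes = np.stack([blur(img, s) for s in sigmas])
t = depth * (len(sigmas) - 1)
i0 = np.clip(t.astype(int), 0, len(sigmas) - 2)
w = t - i0
rows = np.arange(64)[:, None]
cols = np.arange(64)[None, :]
out = (1 - w) * planes[i0, rows, cols] + w * planes[i0 + 1, rows, cols]
```

Because only a handful of full-image blurs are computed up front, the per-pixel work reduces to a lookup and a lerp, which is what makes this style of depth modulation fast enough for interactive rates.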
TensoIS: A Step Towards Feed-Forward Tensorial Inverse Subsurface Scattering for Perlin Distributed Heterogeneous Media
Ashish Tiwari, Satyam Bhardwaj, Yash Bachwana, Parag Sarvoday Sahu, T. M. Feroz Ali, Bhargava Chintalapati, Shanmuganathan Raman
Computer Graphics Forum 44(7), 2025-10-08. DOI: 10.1111/cgf.70242

Estimating scattering parameters of heterogeneous media from images is a severely under-constrained and challenging problem. Most existing approaches model the BSSRDF either through an analysis-by-synthesis approach, approximating complex path integrals, or using differentiable volume rendering techniques to account for heterogeneity. However, only a few studies have applied learning-based methods to estimate subsurface scattering parameters, and they assume homogeneous media. Interestingly, no specific distribution is known to us that can explicitly model the heterogeneous scattering parameters found in the real world. Notably, procedural noise models such as Perlin and fractal Perlin noise have been effective in representing the intricate heterogeneities of natural, organic, and inorganic surfaces. Leveraging this, we first create HeteroSynth, a synthetic dataset comprising photorealistic images of heterogeneous media whose scattering parameters are modeled using fractal Perlin noise. Furthermore, we propose Tensorial Inverse Scattering (TensoIS), a learning-based feed-forward framework to estimate these Perlin-distributed heterogeneous scattering parameters from sparse multi-view image observations. Instead of directly predicting the 3D scattering parameter volume, TensoIS uses learnable low-rank tensor components to represent the scattering volume. We evaluate TensoIS on unseen heterogeneous variations over shapes from the HeteroSynth test set, on smoke and cloud geometries obtained from open-source realistic volumetric simulations, and on some real-world samples to establish its effectiveness for inverse scattering. Overall, this study is an attempt to explore the Perlin noise distribution, given the lack of any such well-defined distribution in the literature, to potentially model real-world heterogeneous scattering in a feed-forward manner.

Project Page: https://yashbachwana.github.io/TensoIS/
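Fractal (multi-octave) noise of the kind used to model HeteroSynth's scattering fields can be sketched compactly. The snippet below uses bilinearly interpolated value noise as a cheap stand-in for gradient Perlin noise and sums octaves into a heterogeneous extinction field; all parameter names and the 2D setting are illustrative, not the paper's 3D construction:

```python
import numpy as np

def value_noise(shape, freq, rng):
    # Random lattice values, bilinearly interpolated up to `shape`
    # (a value-noise stand-in for gradient Perlin noise).
    lat = rng.random((freq + 1, freq + 1))
    ys = np.linspace(0, freq, shape[0], endpoint=False)
    xs = np.linspace(0, freq, shape[1], endpoint=False)
    y0, x0 = ys.astype(int), xs.astype(int)
    ty, tx = (ys - y0)[:, None], (xs - x0)[None, :]
    a = lat[y0][:, x0]; b = lat[y0][:, x0 + 1]
    c = lat[y0 + 1][:, x0]; d = lat[y0 + 1][:, x0 + 1]
    return (a * (1 - tx) + b * tx) * (1 - ty) + (c * (1 - tx) + d * tx) * ty

def fractal_noise(shape, octaves=4, lacunarity=2, gain=0.5, seed=0):
    """Sum octaves of noise with increasing frequency, decreasing amplitude."""
    rng = np.random.default_rng(seed)
    out, amp, freq, norm = np.zeros(shape), 1.0, 4, 0.0
    for _ in range(octaves):
        out += amp * value_noise(shape, freq, rng)
        norm += amp
        amp *= gain
        freq *= lacunarity
    return out / norm  # values in [0, 1]

# Heterogeneous extinction coefficient field (illustrative range).
sigma_t = 0.1 + 0.9 * fractal_noise((128, 128))
```

Each octave doubles the spatial frequency and halves the amplitude, which is what gives fractal noise its characteristic mix of large-scale structure and fine detail.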
High-Performance Elliptical Cone Tracing
U. Emre, A. Kanak, S. Steinberg
Computer Graphics Forum 44(7), 2025-10-06. DOI: 10.1111/cgf.70230

In this work, we discuss elliptical cone traversal in scenes that employ typical triangular meshes. We derive accurate and numerically-stable intersection tests for an elliptical conic frustum with an AABB, a plane, an edge, and a triangle, and analyze the performance of elliptical cone tracing when using different acceleration data structures: SAH-based K-d trees, BVHs, and a modern 8-wide BVH variant adapted for cone tracing, and compare with ray tracing. In addition, several cone traversal algorithms are analyzed, and we develop novel heuristics and optimizations that give better performance than previous traversal approaches. The results highlight the difference in performance characteristics between rays and cones, and serve to guide the design of acceleration data structures for applications that employ cone tracing.
IPFNet: Implicit Primitive Fitting for Robust Point Cloud Segmentation
Shengdi Zhou, Xiaoqiang Zan, Bin Zhou
Computer Graphics Forum 44(7), 2025-10-06. DOI: 10.1111/cgf.70231

The segmentation and fitting of geometric primitives from point clouds is a widely adopted approach for modelling the underlying geometric structure of objects in reverse engineering and numerous graphics applications. Existing methods either overlook the role of geometric information in assisting segmentation or incorporate reconstruction losses without leveraging modern neural implicit field representations, leading to limited robustness against noise and weak expressive power in reconstruction. We propose a point cloud segmentation and fitting framework based on neural implicit representations, fully leveraging neural implicit fields' expressive power and robustness. The key idea is the unification of geometric representation within a neural implicit field framework, enabling seamless integration of geometric loss for improved performance. In contrast to previous approaches that focus solely on clustering in the feature embedding space, our method enhances instance segmentation through semantic-aware point embeddings and simultaneously improves semantic predictions via instance-level feature fusion. Furthermore, we incorporate 3D-specific cues such as spatial dimensions and geometric connectivity, which are uniquely informative in the 3D domain. Extensive experiments and comparisons against previous methods demonstrate our robustness and superiority.
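For contrast with the neural implicit fitting the paper advocates, the classical explicit baseline is direct algebraic fitting of a primitive to segmented points. A sphere, for instance, admits a linear least-squares fit (this is the textbook method, not IPFNet's):

```python
import numpy as np

def fit_sphere(pts):
    """Algebraic least-squares sphere fit: rewrite |x - c|^2 = r^2 as the
    linear system |x|^2 = 2 c.x + (r^2 - |c|^2) and solve for c and r."""
    pts = np.asarray(pts, float)
    A = np.c_[2 * pts, np.ones(len(pts))]
    b = (pts ** 2).sum(axis=1)
    sol, *_ = np.linalg.lstsq(A, b, rcond=None)
    center, k = sol[:3], sol[3]
    radius = np.sqrt(k + center @ center)
    return center, radius

# Noise-free samples on a sphere of radius 3 centered at (1, -2, 0.5).
rng = np.random.default_rng(3)
u = rng.standard_normal((200, 3))
u /= np.linalg.norm(u, axis=1, keepdims=True)
pts = np.array([1.0, -2.0, 0.5]) + 3.0 * u
center, radius = fit_sphere(pts)
```

Such closed-form fits are exact on clean data but degrade under noise and partial coverage, which motivates the learned implicit representation and geometric losses described above.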
Single-Line Drawing Vectorization
Tanguy Magne, Olga Sorkine-Hornung
Computer Graphics Forum 44(7), 2025-10-06. DOI: 10.1111/cgf.70228

Vectorizing line drawings is a repetitive yet necessary task that professional creatives must perform to obtain an easily editable and scalable digital representation of a raster sketch. State-of-the-art automatic methods in this domain can create a series of curves that closely fit the appearance of the drawing. However, they often neglect the line parameterization, so their vector representation cannot be edited naturally by following the drawing order. We present a novel method for single-line drawing vectorization that addresses this issue. Single-line drawings consist of a single stroke, where the line can intersect itself multiple times, making the drawing order non-trivial to recover. Our method fits a single parametric curve, represented as a Bézier spline, to approximate the stroke in the input raster image. To this end, we produce a graph representation of the input and employ geometric priors and a specially trained neural network to correctly capture and classify curve intersections and their traversal configuration. Our method is easily extended to drawings containing multiple strokes while preserving their integrity and order. We compare our vectorized results with the work of several artists, showing that our stroke order is similar to the one artists employ naturally. Our vectorization method achieves state-of-the-art results in terms of similarity with the original drawing and quality of the vectorization on a benchmark of single-line drawings. Our method's results can be refined interactively, making it easy to integrate into professional workflows. Our code and results are available at https://github.com/tanguymagne/SLD-Vectorization.
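The core fitting primitive behind a Bézier-spline vectorizer is least-squares fitting of a cubic segment to ordered stroke samples. A standard single-segment version with pinned endpoints and chord-length parameterization (a common building block; the paper fits a full multi-segment spline with intersection handling):

```python
import numpy as np

def fit_cubic_bezier(pts):
    """Least-squares cubic Bezier fit: endpoints pinned to the first and
    last sample, chord-length parameterization, solve only for the two
    interior control points P1 and P2."""
    pts = np.asarray(pts, float)
    d = np.r_[0.0, np.cumsum(np.linalg.norm(np.diff(pts, axis=0), axis=1))]
    t = d / d[-1]  # chord-length parameters in [0, 1]
    basis = np.stack([(1 - t) ** 3,
                      3 * t * (1 - t) ** 2,
                      3 * t ** 2 * (1 - t),
                      t ** 3], axis=1)
    rhs = pts - np.outer(basis[:, 0], pts[0]) - np.outer(basis[:, 3], pts[-1])
    inner, *_ = np.linalg.lstsq(basis[:, 1:3], rhs, rcond=None)
    return np.vstack([pts[0], inner, pts[-1]])

# Samples from a known cubic are recovered up to parameterization error.
ctrl = np.array([[0, 0], [1, 2], [3, 2], [4, 0]], float)
t = np.linspace(0, 1, 50)[:, None]
samples = ((1 - t) ** 3 * ctrl[0] + 3 * t * (1 - t) ** 2 * ctrl[1]
           + 3 * t ** 2 * (1 - t) * ctrl[2] + t ** 3 * ctrl[3])
fit = fit_cubic_bezier(samples)
```

Pinning the endpoints is what lets consecutive segments of a spline share knots, so the recovered curve stays a single continuous stroke that preserves drawing order.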
Direct volume rendering (DVR) is a widely used technique in the visualisation of volumetric data. As an important DVR technique, volumetric path tracing (VPT) simulates light transport to produce realistic rendering results, which provides enhanced perception and understanding for users, especially in the field of medical imaging. VPT, based on the Monte Carlo (MC) method, typically requires a large number of samples to generate noise-free results. However, in real-time applications, only a limited number of samples per pixel is allowed and significant noise can be created. This paper introduces a novel neural denoising approach that utilises a new feature fusion method for VPT. Our method uses a feature decomposition technique that separates radiance into components according to noise levels. Our new decomposition technique mitigates biases found in the contemporary decoupling denoising algorithm and shows better utilisation of samples. A lightweight dual-input network is designed to correlate these components with noise-free ground truth. Additionally, for denoising sequences of video frames, we develop a learning-based temporal method that calculates temporal weight maps, blending reprojected results of previous frames with spatially denoised current frames. Comparative results demonstrate that our network performs faster inference than existing methods and can produce denoised output of higher quality in real time.
{"title":"Real-time Neural Denoising for Volume Rendering Using Dual-Input Feature Fusion Network","authors":"Chunxiao Xu, Xinran Xu, Jiatian Zhang, Yufei Liu, Yiheng Cao, Lingxiao Zhao","doi":"10.1111/cgf.70276","DOIUrl":"https://doi.org/10.1111/cgf.70276","url":null,"abstract":"<p>Direct volume rendering (DVR) is a widely used technique in the visualisation of volumetric data. As an important DVR technique, volumetric path tracing (VPT) simulates light transport to produce realistic rendering results, which provides enhanced perception and understanding for users, especially in the field of medical imaging. VPT, based on the Monte Carlo (MC) method, typically requires a large number of samples to generate noise-free results. However, in real-time applications, only a limited number of samples per pixel is allowed and significant noise can be created. This paper introduces a novel neural denoising approach that utilises a new feature fusion method for VPT. Our method uses a feature decomposition technique that separates radiance into components according to noise levels. Our new decomposition technique mitigates biases found in the contemporary decoupling denoising algorithm and shows better utilisation of samples. A lightweight dual-input network is designed to correlate these components with noise-free ground truth. Additionally, for denoising sequences of video frames, we develop a learning-based temporal method that calculates temporal weight maps, blending reprojected results of previous frames with spatially denoised current frames. 
Comparative results demonstrate that our network performs faster inference than existing methods and can produce denoised output of higher quality in real time.</p>","PeriodicalId":10687,"journal":{"name":"Computer Graphics Forum","volume":"44 6","pages":""},"PeriodicalIF":2.9,"publicationDate":"2025-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145135446","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
We propose Hi3DFace, a novel framework for simultaneous de-occlusion and high-fidelity 3D face reconstruction. To address real-world occlusions, we construct a diverse facial dataset by simulating common obstructions and present TMANet, a transformer-based multi-scale attention network that effectively removes occlusions and restores clean face images. For the 3D face reconstruction stage, we propose a coarse-medium-fine self-supervised scheme. In the coarse reconstruction pipeline, we adopt a face regression network to predict 3DMM coefficients for generating a smooth 3D face. In the medium-scale reconstruction pipeline, we propose a novel depth displacement network, DDFTNet, to remove noise and restore rich details to the smooth 3D geometry. In the fine-scale reconstruction pipeline, we design a GCN (graph convolutional network) refiner to enhance the fidelity of 3D textures. Additionally, a light-aware network (LightNet) is proposed to distil lighting parameters, ensuring illumination consistency between reconstructed 3D faces and input images. Extensive experimental results demonstrate that the proposed Hi3DFace significantly outperforms state-of-the-art reconstruction methods on four public datasets, and five constructed occlusion-type datasets. Hi3DFace achieves robustness and effectiveness in removing occlusions and reconstructing 3D faces from real-world occluded facial images.
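The coarse pipeline above regresses 3DMM coefficients that are decoded into a smooth face mesh. As a generic sketch of how such a linear morphable model works (the basis sizes, names, and random values are illustrative, not Hi3DFace's actual model), vertices are the mean shape plus coefficient-weighted identity and expression bases:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 3DMM: n_verts vertices, flattened to 3 * n_verts coordinates.
n_verts = 4
mean_shape = rng.normal(size=3 * n_verts)       # mean face geometry
id_basis = rng.normal(size=(3 * n_verts, 5))    # identity basis (5 modes)
exp_basis = rng.normal(size=(3 * n_verts, 3))   # expression basis (3 modes)

def decode_3dmm(id_coeffs, exp_coeffs):
    """Decode regressed coefficients into a mesh:
    vertices = mean + B_id @ alpha + B_exp @ beta."""
    flat = mean_shape + id_basis @ id_coeffs + exp_basis @ exp_coeffs
    return flat.reshape(n_verts, 3)

# Zero coefficients reproduce the mean face exactly.
verts = decode_3dmm(np.zeros(5), np.zeros(3))
```

Because the decoder is linear, a regression network only has to predict a compact coefficient vector, which is why the coarse stage yields smooth geometry that the medium (DDFTNet) and fine (GCN refiner) stages then enrich.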
{"title":"Hi3DFace: High-Realistic 3D Face Reconstruction From a Single Occluded Image","authors":"Dongjin Huang, Yongsheng Shi, Jiantao Qu, Jinhua Liu, Wen Tang","doi":"10.1111/cgf.70277","DOIUrl":"https://doi.org/10.1111/cgf.70277","url":null,"abstract":"<p>We propose Hi3DFace, a novel framework for simultaneous de-occlusion and high-fidelity 3D face reconstruction. To address real-world occlusions, we construct a diverse facial dataset by simulating common obstructions and present TMANet, a transformer-based multi-scale attention network that effectively removes occlusions and restores clean face images. For the 3D face reconstruction stage, we propose a coarse-medium-fine self-supervised scheme. In the coarse reconstruction pipeline, we adopt a face regression network to predict 3DMM coefficients for generating a smooth 3D face. In the medium-scale reconstruction pipeline, we propose a novel depth displacement network, DDFTNet, to remove noise and restore rich details to the smooth 3D geometry. In the fine-scale reconstruction pipeline, we design a GCN (graph convolutional network) refiner to enhance the fidelity of 3D textures. Additionally, a light-aware network (LightNet) is proposed to distil lighting parameters, ensuring illumination consistency between reconstructed 3D faces and input images. Extensive experimental results demonstrate that the proposed Hi3DFace significantly outperforms state-of-the-art reconstruction methods on four public datasets, and five constructed occlusion-type datasets. 
Hi3DFace achieves robustness and effectiveness in removing occlusions and reconstructing 3D faces from real-world occluded facial images.</p>","PeriodicalId":10687,"journal":{"name":"Computer Graphics Forum","volume":"44 6","pages":""},"PeriodicalIF":2.9,"publicationDate":"2025-09-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145135248","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}