Jiayi Xu, Zhengyang Wu, Chenming Zhang, Xiaogang Jin, Yaohua Ji
Fast and highly realistic multi-view hair transfer plays a crucial role in evaluating the effectiveness of virtual hair try-on systems. However, GAN-based generation and editing methods face persistent challenges in feature disentanglement. Achieving pixel-level, attribute-specific modifications—such as changing hairstyle or hair color without affecting other facial features—remains a long-standing problem. To address this limitation, we propose a novel multi-view hair transfer framework that leverages a hair-only intermediate facial representation and a 3D-guided masking mechanism. Our approach disentangles tri-plane facial features into spatial geometric components and global style descriptors, enabling independent and precise control over hairstyle and hair color. By introducing a dedicated intermediate representation focused solely on hair and incorporating a two-stage feature fusion strategy guided by the generated 3D mask, our framework achieves fine-grained local editing across multiple viewpoints while preserving facial integrity and improving background consistency. Extensive experiments demonstrate that our method produces visually compelling and natural results in side-to-front view hair transfer tasks, offering a robust and flexible solution for high-fidelity hair reconstruction and manipulation.
Feature Disentanglement in GANs for Photorealistic Multi-view Hair Transfer. Computer Graphics Forum 44(7), 2025. DOI: 10.1111/cgf.70245.
Per-garment virtual try-on methods collect garment-specific datasets and train networks tailored to each garment to achieve superior results. However, these approaches often struggle with loose-fitting garments due to two key limitations: (1) They rely on human body semantic maps to align garments with the body, but these maps become unreliable when body contours are obscured by loose-fitting garments, resulting in degraded outcomes; (2) They train garment synthesis networks on a per-frame basis without utilizing temporal information, leading to noticeable jittering artifacts. To address the first limitation, we propose a two-stage approach for robust semantic map estimation. First, we extract a garment-invariant representation from the raw input image. This representation is then passed through an auxiliary network to estimate the semantic map. This enhances the robustness of semantic map estimation under loose-fitting garments during garment-specific dataset generation. To address the second limitation, we introduce a recurrent garment synthesis framework that incorporates temporal dependencies to improve frame-to-frame coherence while maintaining real-time performance. We conducted qualitative and quantitative evaluations to demonstrate that our method outperforms existing approaches in both image quality and temporal coherence. Ablation studies further validate the effectiveness of the garment-invariant representation and the recurrent synthesis framework.
Real-Time Per-Garment Virtual Try-On with Temporal Consistency for Loose-Fitting Garments. Zaiqiang Wu, I-Chao Shen, Takeo Igarashi. Computer Graphics Forum 44(7), 2025. DOI: 10.1111/cgf.70272. Open-access PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1111/cgf.70272
The Hausdorff distance is a fundamental metric with widespread applications across many fields. However, computing it exactly remains expensive, especially for large-scale datasets. This work targets the exact point-to-point Hausdorff distance between point sets. We present RT-HDIST, the first Hausdorff distance algorithm accelerated by ray-tracing cores (RT-cores). By reformulating the Hausdorff distance problem as a series of nearest-neighbor searches and introducing a novel quantized voxel-index space, RT-HDIST achieves significant reductions in computational overhead while maintaining exact results. Extensive benchmarks demonstrate up to a two-order-of-magnitude speedup over prior state-of-the-art methods, underscoring RT-HDIST's potential for real-time and large-scale applications.
RT-HDIST: Ray-Tracing Core-based Hausdorff Distance Computation. Young Woo Kim, Jaehong Lee, Duksu Kim. Computer Graphics Forum 44(7), 2025. DOI: 10.1111/cgf.70229.
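The nearest-neighbor reformulation behind RT-HDIST starts from the standard definition: the directed distance h(A, B) takes each point of A to its nearest neighbor in B and keeps the maximum, and the symmetric Hausdorff distance is the larger of the two directed distances. A minimal brute-force NumPy sketch of that definition (only the metric itself, not the RT-core-accelerated algorithm):

```python
import numpy as np

def directed_hausdorff(a, b):
    # h(A, B): for each point in A, find its nearest neighbor in B,
    # then take the worst (largest) of those nearest-neighbor distances.
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)  # pairwise distances
    return d.min(axis=1).max()

def hausdorff(a, b):
    # Symmetric Hausdorff distance: max of the two directed distances.
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))

A = np.array([[0.0, 0.0], [1.0, 0.0]])
B = np.array([[0.0, 0.0], [3.0, 0.0]])
print(hausdorff(A, B))  # 2.0: point (3,0) is 2 away from its nearest neighbor (1,0)
```

The quadratic pairwise-distance matrix is exactly the cost that RT-HDIST's nearest-neighbor searches over a quantized voxel-index space are designed to avoid.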
Yifan Zhao, Liangchen Li, Yuqi Zhou, Kai Wang, Yan Liang, Juyong Zhang
Macro lenses offer high resolution and large magnification, so 3D modeling of small, detailed objects can provide richer information. However, defocus blur in macrophotography is a long-standing problem that severely hinders clear imaging of captured objects and their high-quality 3D reconstruction. Traditional image deblurring methods require a large number of images and annotations, and there is currently no multi-view 3D reconstruction method for macrophotography. In this work, we propose a joint deblurring and 3D reconstruction method for macrophotography. Starting from captured multi-view blurry images, we jointly optimize the clear 3D model of the object and the defocus blur kernel of each pixel. The entire framework adopts differentiable rendering to self-supervise the optimization of the 3D model and the defocus blur kernels. Extensive experiments show that from a small number of multi-view images, our proposed method can not only achieve high-quality image deblurring but also recover high-fidelity 3D appearance.
Joint Deblurring and 3D Reconstruction for Macrophotography. Computer Graphics Forum 44(7), 2025. DOI: 10.1111/cgf.70253.
Displacement mapping is an important tool for modeling detailed geometric features. We explore the problem of authoring complex surfaces while ray tracing interactively. Current techniques for ray tracing displaced surfaces rely on acceleration structures that require dynamic rebuilding when edited. These techniques are typically used for massive static scenes or the compression of detailed source assets. Our interest lies in modeling and look development of artistic features with real-time ray tracing. We introduce projective displacement mapping as a direct sampling method combined with a hardware BVH. Quality and performance are improved over existing methods with smoothed displaced normals, thin feature sampling, tight prism bounds and ray bi-linear patch intersections.
Projective Displacement Mapping for Ray Traced Editable Surfaces. Rama Hoetzlein. Computer Graphics Forum 44(7), 2025. DOI: 10.1111/cgf.70235. Open-access PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1111/cgf.70235
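The direct-sampling idea that distinguishes this approach from BVH-of-microtriangles methods can be loosely illustrated with a simplified 2D heightfield march. This is only a toy sketch of sampling a displacement function during ray traversal, not the paper's prism-bounded projective method; the `height` function is a hypothetical displacement over a flat base line:

```python
import math

def height(x):
    # Hypothetical displacement of the base line y = 0.
    return 0.25 * math.sin(4.0 * x)

def march(origin, direction, t_max=10.0, dt=0.01):
    # Step along the ray and report the first parameter t at which the ray
    # dips below the displaced surface. The displacement is sampled directly
    # at traversal time instead of being pre-tessellated and put in a BVH.
    t = 0.0
    while t < t_max:
        px = origin[0] + t * direction[0]
        py = origin[1] + t * direction[1]
        if py <= height(px):
            return t  # hit parameter along the ray
        t += dt
    return None  # ray left the domain without hitting the surface

# A downward-slanted ray starting above the surface must eventually hit it.
hit = march((0.0, 1.0), (0.7071, -0.7071))
print(hit is not None)  # True
```

Because nothing is precomputed from the displacement, editing `height` takes effect on the very next ray — the property that makes direct sampling attractive for interactive authoring.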
In this paper, we propose an efficient single-stage hybrid architecture for image completion. Existing transformer-based image completion methods often struggle with accurate content restoration, largely due to their ineffective modeling of corrupted channel information and the attention noise introduced by softmax-based mechanisms, which results in blurry textures and distorted structures. Additionally, these methods frequently fail to maintain texture consistency, either relying on imprecise mask sampling or incurring substantial computational costs from complex similarity calculations. To address these limitations, we present two key contributions: a Hybrid Sparse Self-Attention (HSA) module and a Feature Alignment Module (FAM). The HSA module enhances structural recovery by decoupling spatial and channel attention with sparse activation, while the FAM enforces texture consistency by aligning encoder and decoder features via a mask-free, energy-gated mechanism without additional inference cost. Our method achieves state-of-the-art image completion results with the fastest inference speed among single-stage networks, as measured by PSNR, SSIM, FID, and LPIPS on CelebA-HQ, Places2, and Paris datasets.
Hybrid Sparse Transformer and Feature Alignment for Efficient Image Completion. L. Chen, H. Sun. Computer Graphics Forum 44(7), 2025. DOI: 10.1111/cgf.70255.
Generating 3D objects with complex topologies from monocular images remains a challenge in computer graphics, due to the difficulty of modeling varying 3D shapes with disentangled, steerable geometry and visual attributes. NeRF-based methods suffer from slow volumetric rendering and limited structural controllability. Recent advances in 3D Gaussian Splatting provide a more efficient alternative, but generative modeling with separate control over structure and appearance remains underexplored. In this paper, we propose G-SplatGAN, a novel 3D-aware generation framework that combines the rendering efficiency of 3D Gaussian Splatting with disentangled latent modeling. Starting from a shared Gaussian template, our method uses dual modulation branches to modulate geometry and appearance from independent latent codes, enabling precise shape manipulation and controllable generation. We adopt a progressive adversarial training scheme with multi-scale and patch-based discriminators to capture both global structure and local detail. Our model requires no 3D supervision and is trained on monocular images with known camera poses, reducing data reliance while supporting real-image inversion through a geometry-aware encoder. Experiments show that G-SplatGAN achieves superior performance in rendering speed, controllability, and image fidelity, offering a compelling solution for controllable 3D generation using Gaussian representations.
G-SplatGAN: Disentangled 3D Gaussian Generation for Complex Shapes via Multi-Scale Patch Discriminators. Jiaqi Li, Haochuan Dang, Zhi Zhou, Junke Zhu, Zhangjin Huang. Computer Graphics Forum 44(7), 2025. DOI: 10.1111/cgf.70256.
Michael Stroh, Patrick Paetzold, Daniel Berio, Rebecca Kehlbeck, Frederic Fol Leymarie, Oliver Deussen, Noura Faraj
We present an adaptive, semantics-based abstraction approach that balances aesthetic quality and structural coherence within the practical constraints of robotic painting. We apply panoptic segmentation with color-based over-segmentation to partition images into meaningful regions aligned with semantic objects, while providing flexible abstraction levels. Automatic parameter selection for region merging is enabled by semantic saliency maps, derived from Out-of-Distribution segmentation techniques in combination with machine learning methods for feature detection. This preserves the boundaries of salient objects while simplifying less prominent regions. A graph-based community detection step further refines the abstraction by grouping regions according to local connectivity and semantic coherence. The runtime of our method outperforms optimization-based image vectorization methods, enabling the efficient generation of multiple abstraction levels that can serve as hierarchical layers for robotic painting. We demonstrate the quality of our method by showing abstraction results, robotic paintings with the e-David robot, and a comparison to other abstraction methods.
Using Saliency for Semantic Image Abstractions in Robotic Painting. Computer Graphics Forum 44(7), 2025. DOI: 10.1111/cgf.70259. Open-access PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1111/cgf.70259
Constructing and sharing 3D maps is essential for many applications, including autonomous driving and augmented reality. Recently, 3D Gaussian splatting has emerged as a promising approach for accurate 3D reconstruction. However, a practical map-sharing system that features high-fidelity, continuous updates, and network efficiency remains elusive. To address these challenges, we introduce GS-Share, a photorealistic map-sharing system with a compact representation. The core of GS-Share includes anchor-based global map construction, virtual-image-based map enhancement, and incremental map update. We evaluate GS-Share against state-of-the-art methods, demonstrating that our system achieves higher fidelity, particularly for extrapolated views, with improvements of 11%, 22%, and 74% in PSNR, LPIPS, and Depth L1, respectively. Furthermore, GS-Share is significantly more compact, reducing map transmission overhead by 36%.
GS-Share: Enabling High-fidelity Map Sharing with Incremental Gaussian Splatting. Xinran Zhang, Hanqi Zhu, Yifan Duan, Yanyong Zhang. Computer Graphics Forum 44(7), 2025. DOI: 10.1111/cgf.70248.
While pre-trained 3D vision-language models are becoming increasingly available, there remains a lack of frameworks that can effectively harness their capabilities for few-shot classification. In this work, we propose PointGMDA, a training-free framework that combines Gaussian Mixture Models (GMMs) with Gaussian Discriminant Analysis (GDA) to perform robust classification using only a few labeled point cloud samples. Our method estimates GMM parameters per class from support data and computes mixture-weighted prototypes, which are then used in GDA with a shared covariance matrix to construct decision boundaries. This formulation allows us to model intra-class variability more expressively than traditional single-prototype approaches, while maintaining analytical tractability. To incorporate semantic priors, we integrate CLIP-style textual prompts and fuse predictions from geometric and textual modalities through a hybrid scoring strategy. We further introduce PointGMDA-T, a lightweight attention-guided refinement module that learns residuals for fast feature adaptation, improving robustness under distribution shift. Extensive experiments on ModelNet40 and ScanObjectNN demonstrate that PointGMDA outperforms strong baselines across a variety of few-shot settings, with consistent gains under both training-free and fine-tuned conditions. These results highlight the effectiveness and generality of our probabilistic modeling and multimodal adaptation framework. Our code is publicly available at https://github.com/djzgroup/PointGMDA.
Multimodal 3D Few-Shot Classification via Gaussian Mixture Discriminant Analysis. Yiqi Wu, Huachao Wu, Ronglei Hu, Yilin Chen, Dejun Zhang. Computer Graphics Forum 44(7), 2025. DOI: 10.1111/cgf.70268.
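The classifier the PointGMDA abstract describes can be written down compactly: per-class GMM component means collapse into a mixture-weighted prototype, and a shared-covariance Gaussian discriminant scores a query feature linearly in those prototypes. A toy NumPy sketch under stated assumptions (feature values, component means, and weights below are made up for illustration; the CLIP text branch and the PointGMDA-T refinement module are omitted):

```python
import numpy as np

def mixture_prototype(component_means, weights):
    # Collapse a class's GMM component means into one mixture-weighted prototype.
    return np.average(component_means, axis=0, weights=weights)

def gda_scores(x, prototypes, shared_cov, priors):
    # Linear discriminant score per class under a shared covariance S:
    #   delta_k(x) = x^T S^-1 mu_k - 0.5 * mu_k^T S^-1 mu_k + log pi_k
    S_inv = np.linalg.inv(shared_cov)
    return np.array([
        x @ S_inv @ mu - 0.5 * mu @ S_inv @ mu + np.log(pi)
        for mu, pi in zip(prototypes, priors)
    ])

# Toy 2-class setup with two GMM components per class (hypothetical values).
protos = [
    mixture_prototype(np.array([[0.0, 0.0], [0.2, 0.0]]), [0.5, 0.5]),
    mixture_prototype(np.array([[2.0, 2.0], [1.8, 2.0]]), [0.5, 0.5]),
]
scores = gda_scores(np.array([1.9, 1.9]), protos, np.eye(2), [0.5, 0.5])
print(int(np.argmax(scores)))  # 1: the query lies near the second class prototype
```

Sharing one covariance matrix across classes is what keeps the decision boundaries linear and the few-shot estimate stable; per-class covariances would need far more support samples to estimate reliably.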