Latest publications in Graphical Models

Modeling multi-style portrait relief from a single photograph
IF 1.7 · CAS Tier 4, Computer Science · Q2 COMPUTER SCIENCE, SOFTWARE ENGINEERING · Pub Date: 2023-12-01 · Epub Date: 2023-11-28 · DOI: 10.1016/j.gmod.2023.101210
Yu-Wei Zhang, Hongguang Yang, Ping Luo, Zhi Li, Hui Liu, Zhongping Ji, Caiming Zhang

This paper extends the method of Zhang et al. (2023) to produce not only portrait bas-reliefs from single photographs, but also high-depth reliefs with reasonable depth ordering. We cast this task as a problem of style-aware photo-to-depth translation, where the input is a photograph conditioned by a style vector and the output is a portrait relief with the desired depth style. To construct ground-truth data for network training, we first propose an optimization-based method to synthesize high-depth reliefs from 3D portraits. Then, we train a normal-to-depth network to learn the mapping from normal maps to relief depths. After that, we use the trained network to generate high-depth relief samples from the normal maps provided by Zhang et al. (2023). As each normal map is paired with a pixel-wise aligned photograph, we are able to establish correspondences between photographs and high-depth reliefs. By taking the bas-reliefs of Zhang et al. (2023), the new high-depth reliefs, and their mixtures as target ground truths, we finally train an encoder-to-decoder network to achieve style-aware relief modeling. Specifically, the network is based on a U-shaped architecture consisting of Swin Transformer blocks that process hierarchical deep features. Extensive experiments demonstrate the effectiveness of the proposed method, and comparisons with previous works verify its flexibility and state-of-the-art performance.
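For a concrete picture of the photo-to-depth formulation, here is a minimal PyTorch sketch of style-vector conditioning in an encoder-decoder; the tiny convolutional stack merely stands in for the paper's U-shaped Swin Transformer architecture, and all layer sizes and names are illustrative assumptions.

```python
# Hypothetical sketch: style-conditioned photo-to-depth translation.
# The real network uses Swin Transformer blocks in a U-shape; this
# small conv encoder-decoder only illustrates the conditioning idea.
import torch
import torch.nn as nn

class StyleConditionedDepthNet(nn.Module):
    def __init__(self, style_dim=3, width=32):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, width, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(width, width * 2, 3, stride=2, padding=1), nn.ReLU(),
        )
        # The style vector (e.g., bas-relief vs. high-depth, or a mixture)
        # is broadcast and concatenated with the bottleneck features.
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(width * 2 + style_dim, width, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(width, 1, 4, stride=2, padding=1),
        )

    def forward(self, photo, style):
        feat = self.encoder(photo)                       # B x C x H/4 x W/4
        b, _, h, w = feat.shape
        style_map = style[:, :, None, None].expand(b, -1, h, w)
        return self.decoder(torch.cat([feat, style_map], dim=1))  # relief depth

photo = torch.rand(1, 3, 128, 128)
style = torch.tensor([[1.0, 0.0, 0.0]])  # e.g., a "bas-relief" one-hot code
depth = StyleConditionedDepthNet()(photo, style)
```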

Citations: 0
A systematic approach for enhancement of homogeneous background images using structural information
IF 1.7 · CAS Tier 4, Computer Science · Q2 COMPUTER SCIENCE, SOFTWARE ENGINEERING · Pub Date: 2023-12-01 · Epub Date: 2023-10-25 · DOI: 10.1016/j.gmod.2023.101206
D. Vijayalakshmi, Malaya Kumar Nath

Image enhancement is an indispensable pre-processing step for several image processing applications. Histogram equalization, one of the most widespread techniques, improves image quality by expanding pixel values to fill the entire dynamic grayscale range. However, it can introduce visual artifacts, lose structural information near edges (a consequence of its many-to-one mapping), and shift the average luminance to a higher value. This paper proposes an enhancement algorithm based on structural information for homogeneous background images. The intensities are divided into two segments at the median value to preserve the average luminance. Unlike traditional techniques, this algorithm incorporates spatial locations in the equalization process instead of just the number of occurrences of each intensity value. The occurrences of each intensity, together with their spatial locations, are combined using Rényi entropy to enumerate a discrete function. An adaptive clipping limit is applied to the discrete function to control the enhancement rate. Histogram equalization is then performed on each segment separately, and the equalized segments are integrated to produce an enhanced image. The algorithm's effectiveness is validated on the CEED, CSIQ, LOL, and TID2013 databases. Experimental results reveal that the proposed method improves contrast while preserving structural information, detail information, and average luminance. Compared with methods available in the literature, including deep learning architectures, these gains are quantified by higher contrast improvement index, structural similarity index, and discrete entropy values, and by lower average mean brightness error.
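To make the pipeline concrete, the following NumPy sketch splits the histogram at the median, clips each sub-histogram to limit the enhancement rate, and equalizes the segments independently. The paper's spatial-location weighting via Rényi entropy is replaced here by plain occurrence counts, so this is an assumption-laden illustration rather than the authors' exact method.

```python
# Minimal sketch: median-split histogram equalization with clipping.
import numpy as np

def median_split_equalize(img, clip_ratio=2.0):
    med = int(np.median(img))
    out = np.empty_like(img)
    for lo, hi, mask in [(0, med, img <= med), (med + 1, 255, img > med)]:
        if not mask.any():
            continue
        hist, _ = np.histogram(img[mask], bins=hi - lo + 1, range=(lo, hi + 1))
        limit = clip_ratio * hist.mean()
        excess = np.clip(hist - limit, 0, None).sum()
        hist = np.minimum(hist, limit) + excess / hist.size  # redistribute excess
        cdf = np.cumsum(hist) / hist.sum()
        # Map each segment onto its own range, preserving the median split.
        out[mask] = lo + np.round(cdf[img[mask] - lo] * (hi - lo)).astype(img.dtype)
    return out

img = np.random.randint(0, 256, (64, 64), dtype=np.uint8)
enhanced = median_split_equalize(img)
```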

Citations: 0
High-performance Ellipsoidal Clipmaps
IF 1.7 · CAS Tier 4, Computer Science · Q2 COMPUTER SCIENCE, SOFTWARE ENGINEERING · Pub Date: 2023-12-01 · Epub Date: 2023-11-30 · DOI: 10.1016/j.gmod.2023.101209
Aleksandar Dimitrijević, Dejan Rančić

This paper presents performance improvements for Ellipsoid Clipmaps, an out-of-core, planet-sized, geodetically accurate terrain rendering algorithm. The improvements were achieved by eliminating unnecessarily dense levels, more accurate block culling in the geographic coordinate system, and more efficient rendering methods. The elimination of unnecessarily dense levels results from analyzing and determining the optimal relative height of the viewer with respect to the most detailed level, which yields the most consistent triangle size across all visible levels. The proposed method for estimating block visibility based on view orientation allows rapid block-level view frustum culling in data space, before visualization and spatial transformation of blocks. Using a modern geometry pipeline through task and mesh shaders forced the handling of extremely fine-grained blocks, but also shifted a significant part of the block culling process from the CPU to the GPU. The approach achieves high throughput and enables geodetically accurate, real-time rendering of terrain based on the WGS 84 reference ellipsoid at very high resolution, with tens of millions of triangles averaging about 0.5 pix² on a 1080p screen on mid-range graphics cards.
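As an illustration of culling in geographic coordinates, the hypothetical sketch below back-face-culls a terrain block on the WGS 84 ellipsoid directly from its geodetic coordinates; the simple normal-versus-view dot-product test and all names are assumptions, not the paper's culling procedure.

```python
# Hypothetical sketch: view-orientation block culling on the WGS 84 ellipsoid.
import numpy as np

A, B = 6378137.0, 6356752.314245  # WGS 84 semi-axes (meters)

def geodetic_to_ecef(lat, lon, h=0.0):
    e2 = 1.0 - (B / A) ** 2
    n = A / np.sqrt(1.0 - e2 * np.sin(lat) ** 2)  # prime vertical radius
    return np.array([(n + h) * np.cos(lat) * np.cos(lon),
                     (n + h) * np.cos(lat) * np.sin(lon),
                     (n * (1 - e2) + h) * np.sin(lat)])

def block_visible(block_lat, block_lon, eye_ecef):
    # Ellipsoid surface normal at geodetic (lat, lon).
    normal = np.array([np.cos(block_lat) * np.cos(block_lon),
                       np.cos(block_lat) * np.sin(block_lon),
                       np.sin(block_lat)])
    to_eye = eye_ecef - geodetic_to_ecef(block_lat, block_lon)
    return np.dot(normal, to_eye) > 0.0  # keep front-facing blocks only

eye = geodetic_to_ecef(np.radians(45.0), np.radians(20.0), h=500e3)
print(block_visible(np.radians(44.0), np.radians(19.0), eye))
```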

Citations: 0
PU-GAT: Point cloud upsampling with graph attention network
IF 1.7 · CAS Tier 4, Computer Science · Q2 COMPUTER SCIENCE, SOFTWARE ENGINEERING · Pub Date: 2023-12-01 · Epub Date: 2023-09-25 · DOI: 10.1016/j.gmod.2023.101201
Xuan Deng, Cheng Zhang, Jian Shi, Zizhao Wu

Point cloud upsampling has been studied extensively; however, existing approaches lose structural information because they neglect spatial dependencies between points. In this work, we propose PU-GAT, a novel 3D point cloud upsampling method that leverages graph attention networks to learn structural information beyond the baselines. Specifically, we first design a local-global feature extraction unit that combines spatial information and position encoding to mine the local spatial inter-dependencies across point features. Then, we construct an up-down-up feature expansion unit, which uses graph attention and GCNs to enhance the ability to capture local structure information. Extensive experiments on synthetic and real data show that our method achieves superior quantitative and qualitative performance compared with previous methods.
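The kind of building block such a network composes can be sketched compactly: graph attention over k-nearest-neighbor point features. Layer sizes, the dot-product attention form, and the kNN construction below are illustrative assumptions.

```python
# Minimal sketch: attention-weighted aggregation over a kNN point graph.
import torch
import torch.nn.functional as F

def knn_graph_attention(xyz, feats, k=8):
    # xyz: N x 3 point positions, feats: N x C per-point features.
    dist = torch.cdist(xyz, xyz)                          # N x N distances
    idx = dist.topk(k + 1, largest=False).indices[:, 1:]  # drop self-neighbor
    neigh = feats[idx]                                    # N x k x C
    # Attention logits from (center, neighbor) feature pairs.
    center = feats.unsqueeze(1).expand_as(neigh)
    logits = (center * neigh).sum(-1) / feats.shape[-1] ** 0.5  # N x k
    w = F.softmax(logits, dim=-1)
    return (w.unsqueeze(-1) * neigh).sum(1)               # N x C aggregated

pts = torch.rand(256, 3)
f = torch.rand(256, 32)
out = knn_graph_attention(pts, f)
```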

Citations: 0
Fast progressive polygonal approximations for online strokes
IF 1.7 · CAS Tier 4, Computer Science · Q2 COMPUTER SCIENCE, SOFTWARE ENGINEERING · Pub Date: 2023-10-01 · Epub Date: 2023-09-04 · DOI: 10.1016/j.gmod.2023.101200
Mohammad Tanvir Parvez

This paper presents a fast, progressive polygonal approximation algorithm for online strokes. A stroke is defined as a sequence of points between a pen-down and a pen-up. The proposed method generates polygonal approximations progressively as the user inputs the stroke, making it suitable for real-time shape modeling and retrieval. The number of operations used in the algorithm is bounded by O(n), where n is the number of points in a stroke. Detailed experimental results show that the proposed method is not only fast but also sufficiently accurate compared with other reported algorithms.
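The progressive idea lends itself to a streaming formulation. The sketch below is a minimal illustration, assuming a simple rule (commit a vertex when the previous sample deviates too far from the chord to the newest point); it shows how constant work per point gives an O(n) bound, but it is not the paper's specific algorithm.

```python
# Minimal sketch: streaming polygonal approximation, O(1) work per point.
import math

class ProgressiveApprox:
    def __init__(self, tol=0.5):
        self.tol = tol
        self.verts = []    # committed polygon vertices
        self.prev = None   # most recent raw point, not yet committed

    def add_point(self, p):
        if not self.verts:
            self.verts.append(p)
            self.prev = p
            return
        a, q = self.verts[-1], self.prev
        # Perpendicular distance of the previous sample q from the chord
        # running from the last vertex a to the new point p.
        chord = math.hypot(p[0] - a[0], p[1] - a[1]) or 1e-12
        dev = abs((p[0] - a[0]) * (a[1] - q[1])
                  - (a[0] - q[0]) * (p[1] - a[1])) / chord
        if dev > self.tol:          # the stroke bends: commit a vertex
            self.verts.append(q)
        self.prev = p

    def finish(self):
        if self.prev is not None and self.prev != self.verts[-1]:
            self.verts.append(self.prev)
        return self.verts

ap = ProgressiveApprox(tol=0.5)
for pt in [(0, 0), (1, 0.1), (2, 0), (3, 2), (4, 4)]:
    ap.add_point(pt)
print(ap.finish())   # the corner at (2, 0) is kept as a vertex
```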

Citations: 1
MixNet: Mix different networks for learning 3D implicit representations
IF 1.7 · CAS Tier 4, Computer Science · Q2 COMPUTER SCIENCE, SOFTWARE ENGINEERING · Pub Date: 2023-10-01 · Epub Date: 2023-07-25 · DOI: 10.1016/j.gmod.2023.101190
Bowen Lyu, Li-Yong Shen, Chun-Ming Yuan

We introduce a neural network, MixNet, for learning implicit representations of 3D subtle models with large smooth areas and exact shape details, in the form of an interpolation of two different implicit functions. Our network takes a point cloud as input and uses a conventional MLP network and a SIREN network to predict different implicit fields. We use a learnable interpolation function to combine the implicit values of these two networks and obtain the respective advantages of each. The network is self-supervised with only a reconstruction loss, leading to faithful 3D reconstructions with smooth planes, correct details, and plausible spatial partition, without any ground-truth segmentation. We evaluate our method on ABC, the largest and most diverse CAD dataset, and on several typical shapes, testing geometric correctness and surface smoothness to demonstrate superiority over current alternatives for shape reconstruction.
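The core idea, two implicit networks blended by a learnable interpolation weight, can be sketched as follows; layer sizes, the SIREN frequency, and the sigmoid blending head are illustrative assumptions rather than the paper's configuration.

```python
# Minimal sketch: learnable interpolation of a ReLU MLP (smooth regions)
# and a sine-activated SIREN (fine detail) predicting one implicit field.
import torch
import torch.nn as nn

class Sine(nn.Module):
    def forward(self, x):
        return torch.sin(30.0 * x)   # SIREN's frequency scaling

def make_net(act):
    return nn.Sequential(nn.Linear(3, 64), act,
                         nn.Linear(64, 64), act,
                         nn.Linear(64, 1))

class MixNetSketch(nn.Module):
    def __init__(self):
        super().__init__()
        self.mlp = make_net(nn.ReLU())
        self.siren = make_net(Sine())
        self.alpha = make_net(nn.ReLU())  # per-point interpolation weight

    def forward(self, pts):
        a = torch.sigmoid(self.alpha(pts))
        return a * self.mlp(pts) + (1.0 - a) * self.siren(pts)

sdf = MixNetSketch()(torch.rand(1024, 3))
```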

Citations: 0
RFMNet: Robust Deep Functional Maps for unsupervised non-rigid shape correspondence
IF 1.7 · CAS Tier 4, Computer Science · Q2 COMPUTER SCIENCE, SOFTWARE ENGINEERING · Pub Date: 2023-10-01 · Epub Date: 2023-07-29 · DOI: 10.1016/j.gmod.2023.101189
Ling Hu, Qinsong Li, Shengjun Liu, Dong-Ming Yan, Haojun Xu, Xinru Liu

In traditional deep functional maps for non-rigid shape correspondence, estimating a functional map that includes high-frequency information requires either enough linearly independent features for the least-squares solver, a requirement prone to be violated in practice (especially at an early stage of training), or costly post-processing, e.g., ZoomOut. In this paper, we propose a novel method called RFMNet (Robust Deep Functional Map Networks), which jointly considers training stability and more geometric shape features than previous works. We first produce a pointwise map directly by resorting to optimal transport and then convert it to an initial functional map. Such a mechanism relaxes the requirements on the descriptors and avoids the training instabilities caused by the least-squares solver. Benefiting from this strategy, we successfully integrate a state-of-the-art geometric regularization that further optimizes the functional map, substantially filtering the initial estimate. We show that our functional map computation module brings more stable training, even when the functional map encodes high-frequency information, as well as faster convergence. Considering both the pointwise and the functional maps, an unsupervised loss is presented that penalizes the correspondence distortion of Delta functions between shapes. To capture discretization-resistant and orientation-aware shape features, we use DiffusionNet as the feature extractor. Experimental results demonstrate a clear superiority in correspondence quality and generalization across various shape discretizations and different datasets compared with state-of-the-art learning methods.
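The conversion from a pointwise map to an initial functional map is standard and worth making explicit: given truncated Laplace-Beltrami bases, it reduces to a least-squares projection. The sketch below assumes illustrative basis and mesh sizes.

```python
# Minimal sketch: convert a pointwise map T (target vertex per source
# vertex) into a functional map C via least squares, C = pinv(Phi_X) Pi Phi_Y.
import numpy as np

def pointwise_to_functional(T, phi_x, phi_y):
    # T: n_x integer array; phi_x: n_x x k; phi_y: n_y x k.
    # phi_y[T] pulls the target basis functions back onto X through T.
    return np.linalg.pinv(phi_x) @ phi_y[T]   # k x k functional map

n_x, n_y, k = 200, 220, 20
phi_x = np.linalg.qr(np.random.randn(n_x, k))[0]   # stand-in orthonormal bases
phi_y = np.linalg.qr(np.random.randn(n_y, k))[0]
T = np.random.randint(0, n_y, size=n_x)
C = pointwise_to_functional(T, phi_x, phi_y)
```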

Citations: 0
Joint data and feature augmentation for self-supervised representation learning on point clouds
IF 1.7 · CAS Tier 4, Computer Science · Q2 COMPUTER SCIENCE, SOFTWARE ENGINEERING · Pub Date: 2023-10-01 · Epub Date: 2023-07-28 · DOI: 10.1016/j.gmod.2023.101188
Zhuheng Lu, Yuewei Dai, Weiqing Li, Zhiyong Su

To cope with exhausting annotation effort, self-supervised representation learning from unlabeled point clouds has drawn much attention, especially augmentation-based contrastive methods. However, specific augmentations rarely transfer well to high-level tasks on different datasets, and augmentations on point clouds may also change the underlying semantics. To address these issues, we propose a simple but efficient augmentation-fusion contrastive learning framework that combines data augmentations in Euclidean space with feature augmentations in feature space. In particular, we propose a data augmentation method based on sampling and graph generation, and we design a data augmentation network that enables a correspondence of representations by maximizing consistency between augmented graph pairs. We further design a feature augmentation network that encourages the model to learn representations invariant to perturbations by perturbing the encoder. We conduct extensive object classification and object part segmentation experiments to validate the transferability of the proposed framework. Experimental results demonstrate that the framework effectively learns point cloud representations in a self-supervised manner and yields state-of-the-art results. The source code is publicly available at: https://github.com/VCG-NJUST/AFSRL.
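Frameworks of this kind typically maximize agreement between two augmented views with a contrastive objective; a common choice is the NT-Xent loss sketched below, which is assumed here rather than taken from the paper.

```python
# Minimal sketch: NT-Xent contrastive loss between two augmented views.
import torch
import torch.nn.functional as F

def nt_xent(z1, z2, tau=0.1):
    # z1, z2: B x D embeddings of two augmentations of the same batch.
    z = F.normalize(torch.cat([z1, z2]), dim=1)   # 2B x D, unit norm
    sim = z @ z.t() / tau                         # cosine similarities
    sim.fill_diagonal_(float('-inf'))             # exclude self-pairs
    b = z1.shape[0]
    # Positive for row i is the other view of the same sample.
    targets = torch.cat([torch.arange(b, 2 * b), torch.arange(0, b)])
    return F.cross_entropy(sim, targets)

loss = nt_xent(torch.randn(8, 128), torch.randn(8, 128))
```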

Citations: 1
Unsupervised learning of style-aware facial animation from real acting performances
IF 1.7 · CAS Tier 4, Computer Science · Q2 COMPUTER SCIENCE, SOFTWARE ENGINEERING · Pub Date: 2023-10-01 · Epub Date: 2023-09-08 · DOI: 10.1016/j.gmod.2023.101199
Wolfgang Paier, Anna Hilsmann, Peter Eisert

This paper presents a novel approach for text/speech-driven animation of a photo-realistic head model based on blend-shape geometry, dynamic textures, and neural rendering. Training a VAE for geometry and texture yields a parametric model for accurately capturing and realistically synthesizing facial expressions from a latent feature vector. Our animation method is based on a conditional CNN that transforms text or speech into a sequence of animation parameters. In contrast to previous approaches, our animation model learns to disentangle and synthesize different acting styles in an unsupervised manner, requiring only phonetic labels that describe the content of the training sequences. For realistic real-time rendering, we train a U-Net that refines rasterization-based renderings by computing improved pixel colors and a foreground matte. We compare our framework qualitatively and quantitatively against recent methods for head modeling and facial animation, and we evaluate the perceived rendering/animation quality in a user study, which indicates large improvements over state-of-the-art approaches.
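The shape of such an animation model can be sketched as a conditional temporal CNN mapping a phoneme-label sequence plus a style code to a sequence of latent animation parameters; vocabulary size, style handling, and all dimensions below are illustrative assumptions.

```python
# Hypothetical sketch: phoneme sequence + style code -> animation parameters.
import torch
import torch.nn as nn

class Text2AnimSketch(nn.Module):
    def __init__(self, n_phonemes=40, style_dim=8, latent_dim=128):
        super().__init__()
        self.embed = nn.Embedding(n_phonemes, 64)
        self.net = nn.Sequential(
            nn.Conv1d(64 + style_dim, 128, 5, padding=2), nn.ReLU(),
            nn.Conv1d(128, latent_dim, 5, padding=2),
        )

    def forward(self, phonemes, style):
        x = self.embed(phonemes).transpose(1, 2)           # B x 64 x T
        s = style[:, :, None].expand(-1, -1, x.shape[-1])  # broadcast style
        return self.net(torch.cat([x, s], dim=1))          # B x latent x T

params = Text2AnimSketch()(torch.randint(0, 40, (1, 50)), torch.randn(1, 8))
```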

Citations: 0
Unified shape and appearance reconstruction with joint camera parameter refinement
IF 1.7 · CAS Tier 4, Computer Science · Q2 COMPUTER SCIENCE, SOFTWARE ENGINEERING · Pub Date: 2023-10-01 · Epub Date: 2023-08-13 · DOI: 10.1016/j.gmod.2023.101193
Julian Kaltheuner, Patrick Stotko, Reinhard Klein

In this paper, we present an inverse rendering method for the simple reconstruction of the shape and appearance of real-world objects from only roughly calibrated RGB images captured under collocated point-light illumination. To this end, we gradually reconstruct the lower-frequency geometry using automatically generated occupancy mask images, a visual hull initialization of the mesh to infer the object topology, and a smoothness-preconditioned optimization. By combining this geometry estimation with learning-based SVBRDF parameter inference and intrinsic and extrinsic camera parameter refinement in a joint, unified formulation, our novel method is able to reconstruct shape and an isotropic SVBRDF from fewer input images than previous methods. Unlike other works, we also estimate normal maps as part of the SVBRDF to capture and represent higher-frequency geometric details in a compact way. Furthermore, by regularizing the appearance estimation with a GAN-based SVBRDF generator, we are able to meaningfully limit the solution space. In summary, this leads to a robust automatic reconstruction algorithm for shape and appearance. We evaluate our algorithm on synthetic as well as real-world data and demonstrate that it reconstructs complex objects with high-fidelity reflection properties in a robust way, even in the presence of imperfect camera parameter data.
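The collocated point-light setup is what makes this kind of capture tractable: camera and light share a position, so view and light directions coincide. The sketch below illustrates this with a Lambertian-plus-Blinn-Phong stand-in, which is a common simplification and not the paper's SVBRDF model.

```python
# Hypothetical sketch: shading one surface point under a collocated light.
import numpy as np

def shade_collocated(normal, albedo, spec, gloss, to_cam, dist):
    n = normal / np.linalg.norm(normal)
    w = to_cam / np.linalg.norm(to_cam)  # view direction == light direction
    ndotw = max(np.dot(n, w), 0.0)
    # Under collocation the half vector of (view, light) is w itself,
    # so the specular lobe also depends only on n.w.
    return (albedo / np.pi * ndotw + spec * ndotw ** gloss) / dist ** 2

rgb = shade_collocated(np.array([0.0, 0.0, 1.0]),      # surface normal
                       np.array([0.8, 0.5, 0.4]),      # diffuse albedo
                       0.3, 32.0,                      # specular, glossiness
                       np.array([0.2, 0.1, 1.0]), 1.5) # to-camera dir, distance
```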

Citations: 0