
Latest publications in Computers & Graphics-UK

tSPM-Net: A probabilistic spatio-temporal approach for scanpath prediction
IF 2.5 · CAS Tier 4 (Computer Science) · Q2 COMPUTER SCIENCE, SOFTWARE ENGINEERING · Pub Date: 2024-06-22 · DOI: 10.1016/j.cag.2024.103983
Daniel Martin, Diego Gutierrez, Belen Masia

Predicting the path followed by a viewer's eyes when observing an image (a scanpath) is a challenging problem, particularly due to inter- and intra-observer variability and the spatio-temporal dependencies of the visual attention process. Most existing approaches have focused on progressively optimizing the prediction of a gaze point given the previous ones. In this work we instead propose a probabilistic approach, which we call tSPM-Net. We build our method to account for observer variability by resorting to Bayesian deep learning and a probabilistic formulation. In addition, we optimize our model to jointly consider both the spatial and temporal dimensions of scanpaths using a novel spatio-temporal loss function based on a combination of Kullback–Leibler divergence and dynamic time warping. Our tSPM-Net yields results that outperform current state-of-the-art approaches and are closer to the human baseline, suggesting that our model is able to generate scanpaths whose behavior closely resembles that of real ones.
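The loss combines a Kullback–Leibler term with dynamic time warping; a minimal NumPy sketch of such a combination might look as follows. The discrete fixation-map KL, the plain (non-differentiable) DTW recursion, and the `alpha` weighting are illustrative assumptions, not the paper's exact formulation, which presumably uses differentiable variants for training.

```python
import numpy as np

def kl_divergence(p, q, eps=1e-12):
    """KL divergence between two discrete distributions (e.g. fixation maps)."""
    p = p / p.sum()
    q = q / q.sum()
    return float(np.sum(p * np.log((p + eps) / (q + eps))))

def dtw_distance(a, b):
    """Classic dynamic time warping between two sequences of 2D gaze points."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = np.linalg.norm(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return float(D[n, m])

def spatio_temporal_loss(pred_path, true_path, pred_map, true_map, alpha=0.5):
    """Weighted combination of a spatial (KL) and a temporal (DTW) term."""
    return (alpha * kl_divergence(pred_map, true_map)
            + (1 - alpha) * dtw_distance(pred_path, true_path))
```

For identical scanpaths and identical fixation maps both terms vanish, so the loss is zero, as a sanity check.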

Citations: 0
Volumetric nonwoven structures: An algebraic framework for systematic design of infinite polyhedral frames using nonwoven fabric patterns
IF 2.5 · CAS Tier 4 (Computer Science) · Q2 COMPUTER SCIENCE, SOFTWARE ENGINEERING · Pub Date: 2024-06-21 · DOI: 10.1016/j.cag.2024.103979
Tolga Yildiz , Ergun Akleman

In this paper, we present an algebraic framework that can be used to construct a large class of 3D shapes and structures with potentially unusual material properties. We formalize this framework as a 3D generalization of the planar nonwoven textile structures that are used to mimic woven structures. Our extension is based on the fact that it is straightforward to extend planar nonwoven textile structures into volumetric nonwoven textile structures, which we also call nonwoven volumetric fabrics. This property is essential because such an extension is impossible with planar woven structures. In other words, this approach makes it possible to easily produce volumetric structures that mimic fabric behavior as if they were planar nonwoven textile structures, which could not otherwise be produced. These volumetric structures also correspond to regular and semiregular frame structures and are capable of representing previously unknown infinite regular polyhedra and flexible wood structures.

Citations: 0
GLHDR: HDR video reconstruction driven by global to local alignment strategy
IF 2.5 · CAS Tier 4 (Computer Science) · Q2 COMPUTER SCIENCE, SOFTWARE ENGINEERING · Pub Date: 2024-06-19 · DOI: 10.1016/j.cag.2024.103980
Tengyao Cui , Yongfang Wang , Yingjie Yang , Yihan Wang

Reconstructing High Dynamic Range (HDR) video from alternating-exposure Low Dynamic Range (LDR) sequences is an exceptionally challenging task. It not only demands reliable reconstruction of missing information caused by occlusion or motion without introducing artifacts, but must also balance the exposure differences between frames to ensure a visually pleasing reconstructed HDR video. Unfortunately, existing methods are typically complex and struggle with unavoidable artifacts and noise, especially when dealing with low-exposure scenes. To tackle this formidable challenge, we propose a two-stage HDR video reconstruction method that employs a global-to-local alignment strategy. First, we utilize iterative optical flow estimation and hybrid weighting to achieve global alignment, ensuring good reconstruction in the majority of areas. Second, a recursive refinement network further addresses locally misaligned areas, reconstructing HDR frames from bottom to top and recursively refining them to yield faithful reconstruction results. Extensive experimental results demonstrate that our method generates HDR video with fine details and superior visual quality, surpassing state-of-the-art methods across diverse scenes.

Citations: 0
Multi-scale Knowledge Transfer Vision Transformer for 3D vessel shape segmentation
IF 2.5 · CAS Tier 4 (Computer Science) · Q2 COMPUTER SCIENCE, SOFTWARE ENGINEERING · Pub Date: 2024-06-19 · DOI: 10.1016/j.cag.2024.103976
Michael J. Hua , Junjie Wu , Zichun Zhong

To facilitate robust and precise 3D vessel shape extraction and quantification from in-vivo Magnetic Resonance Imaging (MRI), this paper presents a novel multi-scale Knowledge Transfer Vision Transformer (KT-ViT) for 3D vessel shape segmentation. First, it uniquely integrates convolutional embeddings with transformers in a U-net architecture, which simultaneously responds to local receptive fields with convolution layers and to global contexts with transformer encoders in a multi-scale fashion. It therefore intrinsically enriches local vessel features while promoting global connectivity and continuity, yielding a more accurate and reliable vessel shape segmentation. Furthermore, to enable the use of relatively low-resolution (LR) images to segment fine-scale vessel shapes, a novel knowledge transfer network is designed to explore the inter-dependencies of the data and automatically transfer the knowledge gained from high-resolution (HR) data to the low-resolution network at multiple levels, including the multi-scale feature levels and the decision level, through an integration of multi-level loss functions. The fine-scale vessel shape modeling capability possessed by the HR image transformer network can thus be transferred to the LR image transformer to enhance its fine vessel shape segmentation. Extensive experimental results on public image datasets demonstrate that our method outperforms all other state-of-the-art deep learning methods.
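Multi-level knowledge transfer of this kind is commonly written as a weighted sum of feature-level and decision-level distillation terms. The following sketch is an illustration only: the MSE feature term, the KL decision term, and the weights are assumptions, not the paper's exact loss functions.

```python
import numpy as np

def mse(a, b):
    """Mean squared error between two feature maps."""
    return float(np.mean((a - b) ** 2))

def softmax(x):
    """Numerically stable softmax over a logit vector."""
    e = np.exp(x - x.max())
    return e / e.sum()

def transfer_loss(hr_feats, lr_feats, hr_logits, lr_logits,
                  w_feat=1.0, w_dec=1.0, eps=1e-12):
    """Multi-level transfer: per-scale feature-matching terms (MSE) plus a
    decision-level KL term between teacher (HR) and student (LR) outputs."""
    feat_term = sum(mse(f_hr, f_lr) for f_hr, f_lr in zip(hr_feats, lr_feats))
    p_t, p_s = softmax(hr_logits), softmax(lr_logits)
    dec_term = float(np.sum(p_t * np.log((p_t + eps) / (p_s + eps))))
    return w_feat * feat_term + w_dec * dec_term
```

When the student's features and logits match the teacher's, every term vanishes, which is the fixed point the distillation drives towards.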

Citations: 0
3D sketching in immersive environments: Shape from disordered ribbon strokes
IF 2.5 · CAS Tier 4 (Computer Science) · Q2 COMPUTER SCIENCE, SOFTWARE ENGINEERING · Pub Date: 2024-06-18 · DOI: 10.1016/j.cag.2024.103978

Immersive environments with head-mounted displays (HMDs) and hand-held controllers, in either Virtual or Augmented Reality (VR/AR), offer new possibilities for the creation of artistic 3D content. Some of them are exploited by mid-air drawing applications: the user's hand trajectory generates a set of stylized curves or ribbons in space, giving the impression of painting or drawing in 3D. We propose a method to extend this approach to the sketching of surfaces with a VR controller. The idea is to favor shape exploration, offering a tool where the user creates a surface just by painting ribbons. These ribbons are not constrained, for example, to form patch boundaries or to completely cover the shape. They can be very sparse, disordered, overlapping or not, intersecting or not. The shape is computed simultaneously, starting with the first piece of ribbon drawn by the user and continuing to evolve in real time as long as the user keeps sketching. Our method minimizes an energy function based on the projections of the ribbon strokes onto a proxy surface, taking the controller's orientations into account. The current implementation considers elevation surfaces. In addition to many examples, we evaluate the time performance of the dynamic shape modeling with respect to an increasing number of input ribbon strokes. Finally, we present images of an artistic creation by a professional artist that combines stylized curve drawings in VR with our surface sketching tool.

Citations: 0
Differentiable microstructures design via anisotropic thermal diffusion
IF 2.5 · CAS Tier 4 (Computer Science) · Q1 Engineering · Pub Date: 2024-06-15 · DOI: 10.1016/j.cag.2024.103977
Qi Wang, Qing Fang, Xiaoya Zhai, Ligang Liu, Xiao-Ming Fu

We propose a novel method to design differentiable microstructures. Central to our algorithm is a new representation of the mapping from the parameters to microstructures, formulated as the anisotropic thermal diffusion. A metric field governs the anisotropic diffusion. The metric associated with each point is represented as a 2 × 2 symmetric positive definite matrix that becomes the design variable. To alleviate the difficulties caused by symmetric positive definite constraints, we perform the singular value decomposition of the metric matrix so that the design variable includes a rotation angle and a diagonal matrix. Then, the positive definiteness is converted to requiring the two diagonal entries of the diagonal matrix to be positive, which is easier to deal with. The effectiveness of our algorithm is demonstrated through evaluations and comparisons over various examples.
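The SVD-based reparametrization above can be sketched directly. In the illustrative snippet below (not the authors' code) a 2 × 2 SPD metric is built from a rotation angle and two unconstrained scalars; the exponential mapping to positive diagonal entries is an assumption for illustration, since the paper only requires the two diagonal entries to be positive.

```python
import numpy as np

def metric_from_params(theta, s1, s2):
    """Build the 2x2 SPD metric M = R(theta) @ diag(exp(s1), exp(s2)) @ R(theta).T.
    exp() keeps the diagonal entries strictly positive, so theta, s1, s2 are
    fully unconstrained design variables."""
    c, s = np.cos(theta), np.sin(theta)
    R = np.array([[c, -s], [s, c]])          # rotation from the SVD angle
    D = np.diag([np.exp(s1), np.exp(s2)])    # positive diagonal factor
    return R @ D @ R.T
```

By construction the result is symmetric with eigenvalues exp(s1) and exp(s2), so positive definiteness holds for any real inputs, which is exactly the point of the decomposition.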

Citations: 0
FeaST: Feature-guided Style Transfer for high-fidelity art synthesis
IF 2.5 · CAS Tier 4 (Computer Science) · Q1 Engineering · Pub Date: 2024-06-13 · DOI: 10.1016/j.cag.2024.103975
Wen Hao Png, Yichiet Aun, Ming Lee Gan

Text-conditioned image synthesis methods such as DALLE-2, IMAGEN, and Stable Diffusion have recently gained strong attention from the deep learning and art communities. Meanwhile, Image-to-Image (Img2Img) synthesis applications that emerged from the pioneering Neural Style Transfer (NST) approach have swiftly transitioned towards feed-forward Automatic Style Transfer (AST) methods, due to numerous constraints inherent in the former, including inconsistent synthesis outcomes and a sluggish optimization-based synthesis process. However, NST holds significant potential yet remains relatively underexplored within this research domain. In this paper, we revisit the original NST method and uncover its potential to attain image quality comparable to AST synthesis methods across a diverse range of artistic styles. We propose a two-stage Feature-guided Style Transfer (FeaST) which consists of (a) a pre-stylization step called Sketching to address the poor initialization issue, and (b) a Finetuning step that guides the synthesis process based on high-frequency (HF) and low-frequency (LF) guidance channels. By addressing the inconsistent synthesis and slow convergence inherent in the original method, FeaST unlocks the full capabilities of NST and significantly enhances its efficiency.
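A simple way to picture HF/LF guidance channels is a frequency split of the image into a low-pass band and its high-frequency residual. The box blur and kernel size below are assumptions for illustration, not FeaST's actual filters.

```python
import numpy as np

def low_pass(img, k=5):
    """k x k box blur with edge padding (a stand-in for any low-pass filter)."""
    pad = k // 2
    padded = np.pad(img, pad, mode="edge")
    out = np.zeros_like(img, dtype=float)
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / (k * k)

def split_frequency_bands(img, k=5):
    """Decompose a 2D image into (high-frequency residual, low-frequency band)."""
    lf = low_pass(img, k)
    hf = img - lf
    return hf, lf
```

By construction the two bands sum back to the original image, so guidance applied per band cannot lose image content.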

Citations: 0
UVS-CNNs: Constructing general convolutional neural networks on quasi-uniform spherical images
IF 2.5 · CAS Tier 4 (Computer Science) · Q1 Engineering · Pub Date: 2024-06-13 · DOI: 10.1016/j.cag.2024.103973
Yusheng Yang , Zhiyuan Gao , Jinghan Zhang , Wenbo Hui , Hang Shi , Yangmin Xie

Omnidirectional images, also known as spherical images, offer a significant advantage for the environmental sensing of mobile robots due to their wide field of view. However, previous attempts to construct convolutional neural networks on spherical images have been limited by non-uniform pixel sampling, leading to suboptimal performance in semantic segmentation. To address this issue, a novel pixel segmentation approach is proposed to achieve a near-uniform pixel distribution across the entire spherical surface. The corresponding convolution operation for the resulting image is designed as well, which extends the capabilities of spherical CNNs from semantic segmentation to more complex tasks such as instance segmentation. The method is evaluated on the Stanford 2D3DS dataset and shows superior performance compared to conventional spherical CNNs. Furthermore, the method also achieves impressive instance segmentation results on our experimental LiDAR data, demonstrating the general feasibility of our approach for common CNN tasks. The related code and dataset are released at the following link: https://github.com/YoungRainy/UVS-U-Net.
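The paper's quasi-uniform pixel-segmentation scheme is its own construction; as a generic illustration of what near-uniform sampling on a sphere looks like, the standard golden-angle (Fibonacci) spiral is sketched below. This is a common stand-in, not the authors' method.

```python
import numpy as np

def fibonacci_sphere(n):
    """Near-uniform distribution of n points on the unit sphere via the
    golden-angle spiral: uniform spacing in z, golden-angle steps in azimuth."""
    i = np.arange(n)
    phi = np.pi * (3.0 - np.sqrt(5.0)) * i     # golden-angle azimuth increments
    z = 1.0 - 2.0 * (i + 0.5) / n              # uniform band heights in z
    r = np.sqrt(1.0 - z * z)                   # ring radius at height z
    return np.stack([r * np.cos(phi), r * np.sin(phi), z], axis=1)
```

Unlike an equirectangular grid, which oversamples the poles, every point here covers roughly the same solid angle, which is the property uniform spherical CNN sampling is after.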

Citations: 0
Coherent point drift with Skewed Distribution for accurate point cloud registration
IF 2.5 · CAS Tier 4 (Computer Science) · Q2 COMPUTER SCIENCE, SOFTWARE ENGINEERING · Pub Date: 2024-06-13 · DOI: 10.1016/j.cag.2024.103974
Zhuoran Wang, Jianjun Yi, Lin Su, Yihan Pan

Point cloud registration methods based on Gaussian Mixture Models (GMMs) exhibit high robustness. However, a GMM cannot precisely depict a point cloud, because the Gaussian distribution is spatially symmetric while the local surfaces of point clouds are typically non-symmetric. In this paper, we propose a novel method for rigid point cloud registration, termed coherent point drift with Skewed Distribution (Skewed CPD). Our method employs an asymmetric distribution constructed from the local surface normals and curvature radii. Compared to the Gaussian distribution, this skewed distribution provides a more accurate spatial description of points on local surfaces. Additionally, we integrate an adaptive multiplier into the covariance, which reallocates the covariance weight across the different components of the probabilistic mixture model. We employ the EM algorithm to address this maximum likelihood estimation (MLE) problem and leverage GPU acceleration. In the M-step, we adopt an unconstrained optimization technique rooted in Lie group and Lie algebra theory to attain the optimal transformation. Experimental results indicate that our method outperforms state-of-the-art methods in both accuracy and robustness. Remarkably, even without loop closure detection, the cumulative error of our approach remains minimal.
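The paper builds its skewed distribution from local surface normals and curvature radii; as a generic illustration of an asymmetric alternative to the Gaussian, the standard Azzalini skew-normal density 2φ(z)Φ(αz) is sketched below. The specific form used by Skewed CPD may differ.

```python
import math

def skew_normal_pdf(x, loc=0.0, scale=1.0, alpha=0.0):
    """Azzalini skew-normal density: (2/scale) * phi(z) * Phi(alpha * z),
    with z = (x - loc) / scale. alpha = 0 recovers the symmetric Gaussian;
    alpha > 0 shifts probability mass to the right of loc."""
    z = (x - loc) / scale
    phi = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)   # standard normal pdf
    Phi = 0.5 * (1.0 + math.erf(alpha * z / math.sqrt(2.0)))  # standard normal cdf
    return 2.0 / scale * phi * Phi
```

The skewness parameter gives each mixture component a direction of asymmetry, which is what lets such a model hug a one-sided local surface better than a symmetric Gaussian.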

Retargeting of facial model for unordered dense point cloud
IF 2.5 CAS Zone 4 (Computer Science) Q1 Engineering Pub Date: 2024-06-12 DOI: 10.1016/j.cag.2024.103972
Yuping Ye , Juncheng Han , Jixin Liang , Di Wu , Zhan Song

Facial retargeting is a widely used technique in the game and film industries that replicates the expressions of a source facial model onto a target model. Existing methods for facial retargeting rely on either hand-crafted uniform triangle meshes or sparse points obtained from motion capture (mocap). In this paper, we propose an end-to-end facial retargeting algorithm that copies facial expressions from unordered dense point clouds onto the target model. First, a correspondence-building method based on bi-harmonic functions is introduced to ensure that the template model and a cluster of point clouds share the same triangle topology. Second, a deformation transfer method is presented to transfer the computed deformation onto the target model. Several experiments are conducted on the SIAT-3DFE dataset to demonstrate the accuracy and efficiency of our method.
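Once the correspondence step has put all meshes into the same triangle topology, the simplest possible form of the transfer step can be sketched as below. This is only a naive per-vertex-offset illustration: the paper's actual deformation transfer operates on deformations rather than raw displacements, and the `scale` parameter is an assumption added for illustration.

```python
import numpy as np

def transfer_displacements(src_neutral, src_expr, tgt_neutral, scale=1.0):
    """Naive expression transfer for meshes sharing one triangle topology:
    copy per-vertex displacements from the source expression onto the
    target's neutral vertices. Arrays are (V, 3) vertex positions."""
    assert src_neutral.shape == src_expr.shape == tgt_neutral.shape
    delta = src_expr - src_neutral          # per-vertex expression offsets
    return tgt_neutral + scale * delta
```

Because vertex i of every mesh corresponds to the same semantic point after the correspondence step, the offset copy is well defined; a production method would instead transfer per-triangle deformation gradients to respect differences in face proportions.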
