With the development of virtual reality technology, simulated surgery has become a low-risk surgical training method, and high-precision positioning of surgical instruments is required in virtual simulated surgery. In this paper we design and validate a novel electromagnetic positioning method based on a uniform gradient magnetic field. We employ Maxwell coils to generate the uniform gradient magnetic field and propose two magnetic-field-based positioning algorithms, namely a linear equation positioning algorithm and a magnetic field fingerprint positioning algorithm. After validating the feasibility of the proposed positioning system through simulation, we construct a prototype system and conduct practical experiments. The experimental results demonstrate that the positioning system exhibits excellent accuracy and speed in both simulation and real-world applications, and the positioning accuracy remains consistently high, showing no significant variation with changes in the positions of the surgical instruments.
{"title":"Uniform gradient magnetic field and spatial localization method based on Maxwell coils for virtual surgery simulation","authors":"Yi Huang, Xutian Deng, Xujie Zhao, Wenxuan Xie, Zhiyong Yuan, Jianhui Zhao","doi":"10.1002/cav.2247","DOIUrl":"https://doi.org/10.1002/cav.2247","url":null,"abstract":"<p>With the development of virtual reality technology, simulation surgery has become a low-risk surgical training method and high-precision positioning of surgical instruments is required in virtual simulation surgery. In this paper we design and validate a novel electromagnetic positioning method based on a uniform gradient magnetic field. We employ Maxwell coils to generate the uniform gradient magnetic field and propose two positioning algorithms based on magnetic field, namely the linear equation positioning algorithm and the magnetic field fingerprint positioning algorithm. After validating the feasibility of proposed positioning system through simulation, we construct a prototype system and conduct practical experiments. The experimental results demonstrate that the positioning system exhibits excellent accuracy and speed in both simulation and real-world applications. The positioning accuracy remains consistent and high, showing no significant variation with changes in the positions of surgical instruments.</p>","PeriodicalId":50645,"journal":{"name":"Computer Animation and Virtual Worlds","volume":"35 3","pages":""},"PeriodicalIF":1.1,"publicationDate":"2024-05-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140953060","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Facial action units (AUs) encode the activations of facial muscle groups, playing a crucial role in expression analysis and facial animation. However, current deep learning AU detection methods primarily focus on single-image analysis, which limits the exploitation of rich temporal context for robust outcomes. Moreover, the scale of available datasets remains limited, so models trained on them tend to suffer from overfitting. This paper proposes a novel AU detection method integrating spatial and temporal data with inter-subject feature reassignment for accurate and robust AU predictions. Our method first extracts regional features from facial images. Then, to effectively capture both temporal context and identity-independent features, we introduce a temporal feature combination and feature reassignment (TC&FR) module, which transforms single-image features into a cohesive temporal sequence and fuses features across multiple subjects. This transformation encourages the model to utilize identity-independent features and temporal context, thus ensuring robust prediction outcomes. Experimental results demonstrate the enhancements brought by the proposed modules and the state-of-the-art (SOTA) results achieved by our method.
{"title":"Facial action units detection using temporal context and feature reassignment","authors":"Sipeng Yang, Hongyu Huang, Ying Sophie Huang, Xiaogang Jin","doi":"10.1002/cav.2246","DOIUrl":"https://doi.org/10.1002/cav.2246","url":null,"abstract":"<p>Facial action units (AUs) encode the activations of facial muscle groups, playing a crucial role in expression analysis and facial animation. However, current deep learning AU detection methods primarily focus on single-image analysis, which limits the exploitation of rich temporal context for robust outcomes. Moreover, the scale of available datasets remains limited, leading models trained on these datasets to tend to suffer from overfitting issues. This paper proposes a novel AU detection method integrating spatial and temporal data with inter-subject feature reassignment for accurate and robust AU predictions. Our method first extracts regional features from facial images. Then, to effectively capture both the temporal context and identity-independent features, we introduce a temporal feature combination and feature reassignment (TC&FR) module, which transforms single-image features into a cohesive temporal sequence and fuses features across multiple subjects. This transformation encourages the model to utilize identity-independent features and temporal context, thus ensuring robust prediction outcomes. Experimental results demonstrate the enhancements brought by the proposed modules and the state-of-the-art (SOTA) results achieved by our method.</p>","PeriodicalId":50645,"journal":{"name":"Computer Animation and Virtual Worlds","volume":"35 3","pages":""},"PeriodicalIF":1.1,"publicationDate":"2024-05-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140953071","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
As an amalgamation of landscape design and ichthyology, aquascaping endeavors to create visually captivating aquatic environments imbued with artistic allure. Traditional aquascaping methodologies, governed by rigid principles such as composition and color coordination, may inadvertently curtail the aesthetic potential of the landscapes. In this paper, we propose Aquascape Generation based on Stable Diffusion Models (AG-SDM), which prioritizes aesthetic principles and color coordination to offer guiding principles for real artists in aquascape creation. We meticulously curated and annotated three aquascape datasets with varying aspect ratios to accommodate diverse landscape design requirements regarding dimensions and proportions. Leveraging the Fréchet Inception Distance (FID) metric, we trained AGFID for quality assessment. Extensive experiments validate that AG-SDM excels in generating hyper-realistic underwater landscape images that closely resemble real flora, and achieves state-of-the-art performance in aquascape image generation.
{"title":"AG-SDM: Aquascape generation based on stable diffusion model with low-rank adaptation","authors":"Muyang Zhang, Jinming Yang, Yuewei Xian, Wei Li, Jiaming Gu, Weiliang Meng, Jiguang Zhang, Xiaopeng Zhang","doi":"10.1002/cav.2252","DOIUrl":"https://doi.org/10.1002/cav.2252","url":null,"abstract":"<p>As an amalgamation of landscape design and ichthyology, aquascape endeavors to create visually captivating aquatic environments imbued with artistic allure. Traditional methodologies in aquascape, governed by rigid principles such as composition and color coordination, may inadvertently curtail the aesthetic potential of the landscapes. In this paper, we propose Aquascape Generation based on Stable Diffusion Models (AG-SDM), prioritizing aesthetic principles and color coordination to offer guiding principles for real artists in Aquascape creation. We meticulously curated and annotated three aquascape datasets with varying aspect ratios to accommodate diverse landscape design requirements regarding dimensions and proportions. Leveraging the Fréchet Inception Distance (FID) metric, we trained AGFID for quality assessment. Extensive experiments validate that our AG-SDM excels in generating hyper-realistic underwater landscape images, closely resembling real flora, and achieves state-of-the-art performance in aquascape image generation.</p>","PeriodicalId":50645,"journal":{"name":"Computer Animation and Virtual Worlds","volume":"35 3","pages":""},"PeriodicalIF":1.1,"publicationDate":"2024-05-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140953057","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Facial expression recognition (FER) is one of the popular research topics in computer vision. Most deep learning expression recognition methods perform well on a single dataset but may struggle in cross-domain FER applications when applied to different datasets. Cross-dataset FER also suffers from difficulties such as feature distribution deviation and discriminator degradation. To address these issues, we propose a prototype-oriented similarity transfer framework (POST) for cross-domain FER. A bidirectional cross-attention Swin Transformer (BCS Transformer) module is designed to aggregate local facial feature similarities across different domains, enabling the extraction of relevant cross-domain features. Dual learnable category prototypes are designed to represent potential space samples for both the source and target domains, ensuring enhanced domain alignment by leveraging both cross-domain and domain-specific features. We further introduce a self-training resampling (STR) strategy to enhance similarity transfer. Experimental results with the RAF-DB dataset as the source domain and the CK+, FER2013, JAFFE, and SFEW 2.0 datasets as the target domains show that our approach achieves much higher performance than state-of-the-art cross-domain FER methods.
{"title":"POST: Prototype-oriented similarity transfer framework for cross-domain facial expression recognition","authors":"Zhe Guo, Bingxin Wei, Qinglin Cai, Jiayi Liu, Yi Wang","doi":"10.1002/cav.2260","DOIUrl":"https://doi.org/10.1002/cav.2260","url":null,"abstract":"<p>Facial expression recognition (FER) is one of the popular research topics in computer vision. Most deep learning expression recognition methods perform well on a single dataset, but may struggle in cross-domain FER applications when applied to different datasets. FER under cross-dataset also suffers from difficulties such as feature distribution deviation and discriminator degradation. To address these issues, we propose a prototype-oriented similarity transfer framework (POST) for cross-domain FER. The bidirectional cross-attention Swin Transformer (BCS Transformer) module is designed to aggregate local facial feature similarities across different domains, enabling the extraction of relevant cross-domain features. The dual learnable category prototypes is designed to represent potential space samples for both source and target domains, ensuring enhanced domain alignment by leveraging both cross-domain and specific domain features. We further introduce the self-training resampling (STR) strategy to enhance similarity transfer. The experimental results with the RAF-DB dataset as the source domain and the CK+, FER2013, JAFFE and SFEW 2.0 datasets as the target domains, show that our approach achieves much higher performance than the state-of-the-art cross-domain FER methods.</p>","PeriodicalId":50645,"journal":{"name":"Computer Animation and Virtual Worlds","volume":"35 3","pages":""},"PeriodicalIF":1.1,"publicationDate":"2024-05-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140953058","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Scene cartoonization aims to convert photos into stylized cartoons. While generative adversarial networks (GANs) can generate high-quality images, previous methods focus on individual images or single styles, ignoring relationships between datasets. We propose a novel multi-style scene cartoonization GAN that leverages multiple cartoon datasets jointly. Our main technical contribution is a multi-branch style encoder that disentangles representations to model styles as distributions over entire datasets rather than individual images. Combined with a multi-task discriminator and perceptual losses optimized across collections, our model achieves state-of-the-art diverse stylization while preserving semantics. Experiments demonstrate that, by learning from inter-dataset relationships, our method translates photos into cartoon images with improved realism and abstraction fidelity compared to prior art, without iterative re-training for new styles.
{"title":"Multi-style cartoonization: Leveraging multiple datasets with generative adversarial networks","authors":"Jianlu Cai, Frederick W. B. Li, Fangzhe Nan, Bailin Yang","doi":"10.1002/cav.2269","DOIUrl":"https://doi.org/10.1002/cav.2269","url":null,"abstract":"<p>Scene cartoonization aims to convert photos into stylized cartoons. While generative adversarial networks (GANs) can generate high-quality images, previous methods focus on individual images or single styles, ignoring relationships between datasets. We propose a novel multi-style scene cartoonization GAN that leverages multiple cartoon datasets jointly. Our main technical contribution is a multi-branch style encoder that disentangles representations to model styles as distributions over entire datasets rather than images. Combined with a multi-task discriminator and perceptual losses optimizing across collections, our model achieves state-of-the-art diverse stylization while preserving semantics. Experiments demonstrate that by learning from inter-dataset relationships, our method translates photos into cartoon images with improved realism and abstraction fidelity compared to prior arts, without iterative re-training for new styles.</p>","PeriodicalId":50645,"journal":{"name":"Computer Animation and Virtual Worlds","volume":"35 3","pages":""},"PeriodicalIF":1.1,"publicationDate":"2024-05-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140953072","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
As an enhancement to skinning-based animation, lightweight secondary motion methods for 3D characters are in wide demand across many application scenarios. To address the dependence of data-driven methods on ground truth data, we propose a self-supervised training strategy that, for the first time in this domain, requires no ground truth data. Specifically, we construct a self-supervised training framework by modeling the stepwise implicit integration problem as an optimization problem based on physical energy terms. Furthermore, we introduce a multi-scale edge aggregation mesh-graph block (MSEA-MG Block), which significantly enhances network performance. This enables our model to make vivid predictions of secondary motion for 3D characters with arbitrary structures. Empirical experiments indicate that our method, without requiring ground truth data for model training, achieves comparable or even superior performance, quantitatively and qualitatively, to state-of-the-art data-driven approaches in the field.
{"title":"Multi-scale edge aggregation mesh-graph-network for character secondary motion","authors":"Tianyi Wang, Shiguang Liu","doi":"10.1002/cav.2241","DOIUrl":"https://doi.org/10.1002/cav.2241","url":null,"abstract":"<p>As an enhancement to skinning-based animations, light-weight secondary motion method for 3D characters are widely demanded in many application scenarios. To address the dependence of data-driven methods on ground truth data, we propose a self-supervised training strategy that is free of ground truth data for the first time in this domain. Specifically, we construct a self-supervised training framework by modeling the implicit integration problem with steps as an optimization problem based on physical energy terms. Furthermore, we introduce a multi-scale edge aggregation mesh-graph block (MSEA-MG Block), which significantly enhances the network performance. This enables our model to make vivid predictions of secondary motion for 3D characters with arbitrary structures. Empirical experiments indicate that our method, without requiring ground truth data for model training, achieves comparable or even superior performance quantitatively and qualitatively compared to state-of-the-art data-driven approaches in the field.</p>","PeriodicalId":50645,"journal":{"name":"Computer Animation and Virtual Worlds","volume":"35 3","pages":""},"PeriodicalIF":1.1,"publicationDate":"2024-05-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140953062","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Road network design, an important part of landscape modeling, is of great significance in autonomous driving, video game development, and disaster simulation. To date, this task remains labor-intensive, tedious, and time-consuming. Many improved techniques have been proposed during the last two decades; nevertheless, most state-of-the-art methods still encounter problems of intuitiveness, usefulness, and/or interactivity. Departing from conventional road design, this paper advocates an improved road modeling framework for automatic and interactive road production driven by geographical maps (including elevation, water, and vegetation maps). Our method integrates the capability of flexible image generation models with a powerful transformer architecture to produce a vectorized road network. We first construct a dataset that includes road graphs, density maps, and their corresponding geographical maps. Second, we develop a density map generation network based on an image translation model with an attention mechanism to predict a road density map; the density map facilitates faster convergence and better performance, and also serves as the input for road graph generation. Third, we employ a transformer architecture to convert density maps into road graphs. Comprehensive experimental results verify the efficiency, robustness, and applicability of our newly proposed framework for road design.
{"title":"A novel transformer-based graph generation model for vectorized road design","authors":"Peichi Zhou, Chen Li, Jian Zhang, Changbo Wang, Hong Qin, Long Liu","doi":"10.1002/cav.2267","DOIUrl":"https://doi.org/10.1002/cav.2267","url":null,"abstract":"<p>Road network design, as an important part of landscape modeling, shows a great significance in automatic driving, video game development, and disaster simulation. To date, this task remains labor-intensive, tedious and time-consuming. Many improved techniques have been proposed during the last two decades. Nevertheless, most of the state-of-the-art methods still encounter problems of intuitiveness, usefulness and/or interactivity. As a rapid deviation from the conventional road design, this paper advocates an improved road modeling framework for automatic and interactive road production driven by geographical maps (including elevation, water, vegetation maps). Our method integrates the capability of flexible image generation models with powerful transformer architecture to afford a vectorized road network. We firstly construct a dataset that includes road graphs, density map and their corresponding geographical maps. Secondly, we develop a density map generation network based on image translation model with an attention mechanism to predict a road density map. The usage of density map facilitates faster convergence and better performance, which also serves as the input for road graph generation. Thirdly, we employ the transformer architecture to evolve density maps to road graphs. Our comprehensive experimental results have verified the efficiency, robustness and applicability of our newly-proposed framework for road design.</p>","PeriodicalId":50645,"journal":{"name":"Computer Animation and Virtual Worlds","volume":"35 3","pages":""},"PeriodicalIF":1.1,"publicationDate":"2024-05-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140953059","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Image inpainting is a long-standing problem in computer vision. Since the advent of deep learning, image inpainting has advanced steadily alongside convolutional neural networks and generative adversarial networks, and has been extended to settings such as guided image filling and inpainting with various masking schemes; the related field of image out-painting has also been pioneered. Meanwhile, following the recent introduction of the vision transformer, various computer vision problems have been revisited with it. In this paper, we tackle generalized image inpainting with a vision transformer: filling missing regions with plausible content regardless of whether those regions lie inside or outside the image, and without any guidance. To that end, the inpainting problem is formulated as dropping image content in patch units, which suits the vision transformer, and we solve it with a simple network structure created by slightly modifying the vision transformer to fit the problem. We name this network PIPformers. PIPformers achieves better PSNR, RMSE, and SSIM values than previous methods.
{"title":"PIPformers: Patch based inpainting with vision transformers for generalize paintings","authors":"Jeyoung Lee, Hochul Kang","doi":"10.1002/cav.2270","DOIUrl":"https://doi.org/10.1002/cav.2270","url":null,"abstract":"<p>Image inpainting is a field that has been traditionally attempted in the field of computer vision. After the development of deep learning, image inpainting has been advancing endlessly together with convolutional neural networks and generative adversarial networks. Thereafter, it has been expanded to various fields such as image filing through guiding and image inpainting using various masking. Furthermore, the field termed image out-painting has also been pioneered. Meanwhile, after the recent announcement of the vision transformer, various computer vision problems have been attempted using the vision transformer. In this paper, we are trying to solve the problem of image generalization painting using the vision transformer. This is an attempt to fill images with painting no matter whether the areas where painting is missing are in or out of the images, and without guiding. To that end, the painting problem was defined as a problem to drop images in patch units for easy use in the vision transformer. And we solved the problem with a simple network structure created by slightly modifying the vision transformer to fit the problem. We named this network PIPformers. PIPformers achieved better values than other papers compared to PSNR, RMSE and SSIM.</p>","PeriodicalId":50645,"journal":{"name":"Computer Animation and Virtual Worlds","volume":"35 3","pages":""},"PeriodicalIF":1.1,"publicationDate":"2024-05-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/cav.2270","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140953055","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In the field of acoustic simulation, widely applied and highly effective methods rely on accurately capturing the impulse response (IR) and its convolution relationship. This article introduces a novel approach, named UnderwaterImage2IR, that generates acoustic IRs from underwater images using dual-path pre-trained networks. This technique aims to achieve cross-modal conversion from underwater visual images to acoustic information with high accuracy at low cost. Our method integrates dual-path pre-trained networks with conditional generative adversarial networks (CGANs) to generate acoustic IRs that match the observed scenes. One branch of the network focuses on extracting spatial features from images, while the other is dedicated to recognizing underwater characteristics. These features are fed into the CGAN, which is trained to generate acoustic IRs corresponding to the observed scenes, thereby achieving high-accuracy acoustic simulation in an efficient manner. Experimental results, compared with the ground truth and evaluated by human experts, demonstrate the significant advantages of our method in generating underwater acoustic IRs, further proving its potential application in underwater acoustic simulation.
{"title":"UnderwaterImage2IR: Underwater impulse response generation via dual-path pre-trained networks and conditional generative adversarial networks","authors":"Yisheng Zhang, Shiguang Liu","doi":"10.1002/cav.2243","DOIUrl":"https://doi.org/10.1002/cav.2243","url":null,"abstract":"<p>In the field of acoustic simulation, methods that are widely applied and have been proven to be highly effective rely on accurately capturing the impulse response (IR) and its convolution relationship. This article introduces a novel approach, named as UnderwaterImage2IR, that generates acoustic IRs from underwater images using dual-path pre-trained networks. This technique aims to achieve cross-modal conversion from underwater visual images to acoustic information with high accuracy at a low cost. Our method utilizes deep learning technology by integrating dual-path pre-trained networks and conditional generative adversarial networks conditional generative adversarial networks (CGANs) to generate acoustic IRs that match the observed scenes. One branch of the network focuses on the extraction of spatial features from images, while the other is dedicated to recognizing underwater characteristics. These features are fed into the CGAN network, which is trained to generate acoustic IRs corresponding to the observed scenes, thereby achieving high-accuracy acoustic simulation in an efficient manner. Experimental results, compared with the ground truth and evaluated by human experts, demonstrate the significant advantages of our method in generating underwater acoustic IRs, further proving its potential application in underwater acoustic simulation.</p>","PeriodicalId":50645,"journal":{"name":"Computer Animation and Virtual Worlds","volume":"35 3","pages":""},"PeriodicalIF":1.1,"publicationDate":"2024-05-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140953061","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Traditional microfacet rendering models usually consider only the straight-line propagation of light and do not take the diffraction effect into account when calculating the radiance of outgoing light. However, ignoring the energy contributed by diffraction can lead to darker rendering results when the object's surface has many small details. To address this issue, we introduce a diffraction energy term in the microfacet model to compensate for the energy loss caused by diffraction. Starting from the Fresnel-Kirchhoff diffraction theorem, we combine it with the Cook-Torrance model. By incorporating the computed diffraction radiance into the outgoing radiance of the microfacet, we obtain a diffraction-compensated BRDF (bidirectional reflectance distribution function) model. Experimental results demonstrate that our proposed method has a significant effect in compensating for outgoing light and produces more realistic rendering results.
{"title":"Microfacet rendering with diffraction compensation","authors":"Xudong Yang, Aoran Lyu, Chuhua Xian, Hongmin Cai","doi":"10.1002/cav.2253","DOIUrl":"https://doi.org/10.1002/cav.2253","url":null,"abstract":"<p>The traditional microfacet rendering models usually only consider the straight propagation of light and do not take into account the diffraction effect when calculating the radiance of outgoing light. However, ignoring the energy generated by diffraction can lead to darker rendering results when the object's surface has many small details. To address this issue, we introduce a diffraction energy term in the microfacet model to compensate for the energy loss caused by diffraction. Starting from the Fresnel-Kirchhoff diffraction theorem, we combine it with the Cook-Torrance model. By incorporating the computed diffraction radiance into the outgoing radiance of the microfacet, we obtain a diffraction-compensated BRDF (Bidirectional Reflectance Distribution Function) model. Experimental results demonstrate that our proposed method has a significant effect in compensating for outgoing light and produces more realistic rendering results.</p>","PeriodicalId":50645,"journal":{"name":"Computer Animation and Virtual Worlds","volume":"35 3","pages":""},"PeriodicalIF":1.1,"publicationDate":"2024-05-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140953054","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}