
Latest publications in Computers & Graphics-Uk

Foreword to special section: Highlights from EuroVA 2024
IF 2.8 · CAS Tier 4 (Computer Science) · Q2 COMPUTER SCIENCE, SOFTWARE ENGINEERING · Pub Date: 2025-10-01 · DOI: 10.1016/j.cag.2025.104450
Hans-Jörg Schulz, Marco Angelini
Citations: 0
Examining the attribution of gender and the perception of emotions in virtual humans
IF 2.8 · CAS Tier 4 (Computer Science) · Q2 COMPUTER SCIENCE, SOFTWARE ENGINEERING · Pub Date: 2025-09-29 · DOI: 10.1016/j.cag.2025.104446
Victor Flávio de Andrade Araujo, Angelo Brandelli Costa, Soraia Raupp Musse
Virtual Humans (VHs) are becoming increasingly realistic, raising questions about how users perceive their gender and emotions. In this study, we investigate how textually assigned gender and visual facial features influence both gender attribution and emotion recognition in VHs. Two experiments were conducted. In the first, participants evaluated a nonbinary VH animated with expressions performed by both male and female actors. In the second, participants assessed binary male and female VHs animated by either real actors or data-driven facial styles. Results show that users often rely on textual gender cues and facial features to assign gender to VHs. Emotion recognition was more accurate when expressions were performed by actresses or derived from facial styles, particularly in nonbinary models. Notably, participants more consistently attributed gender according to textual cues when the VH was visually androgynous, suggesting that the absence of strong gendered facial markers increases reliance on textual information. These findings offer insights for designing more inclusive and perceptually coherent virtual agents.
Citations: 0
CeRF: Convolutional neural radiance derivative fields for new view synthesis
IF 2.8 · CAS Tier 4 (Computer Science) · Q2 COMPUTER SCIENCE, SOFTWARE ENGINEERING · Pub Date: 2025-09-27 · DOI: 10.1016/j.cag.2025.104447
Wenjie Liu, Ling You, Xiaoyan Yang, Dingbo Lu, Yang Li, Changbo Wang
Recently, Neural Radiance Fields (NeRF) has seen a surge in popularity, driven by its ability to generate high-fidelity novel view synthesized images. However, unexpected “floating ghost” artifacts usually emerge with limited training views and intricate optical phenomena. This issue stems from the inherent ambiguities in radiance fields, rooted in the fundamental volume rendering equation and the unrestricted learning paradigms in multi-layer perceptrons. In this paper, we introduce Convolutional Neural Radiance Fields (CeRF), a novel approach to model the derivatives of radiance along rays and solve the ambiguities through a fully neural rendering pipeline. To this end, a single-surface selection mechanism involving both a modified softmax function and an ideal point is proposed to implement our radiance derivative fields. Furthermore, a structured neural network architecture with 1D convolutional operations is employed to further boost the performance by extracting latent ray representations. Extensive experiments demonstrate the promising results of our proposed model compared with existing state-of-the-art approaches.
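The single-surface selection idea can be made concrete numerically: a sharply tempered softmax over per-sample densities along a ray collapses the rendering weights onto one sample, suppressing "floating ghost" contributions. A minimal NumPy sketch, assuming a simple temperature-style modification (the paper's exact formulation, including the ideal point, is not reproduced here):

```python
import numpy as np

def single_surface_weights(logits, tau=0.1):
    # Sharpened softmax over the samples of one ray: a low temperature
    # pushes almost all rendering weight onto a single surface sample.
    # (Illustrative stand-in for the paper's modified softmax; `tau`
    # and the toy setup below are our assumptions, not from the paper.)
    z = (logits - logits.max()) / tau
    w = np.exp(z)
    return w / w.sum()

# Toy ray with 64 samples and one strong density peak at sample 40.
logits = np.zeros(64)
logits[40] = 5.0
w = single_surface_weights(logits)
depth = float((w * np.linspace(0.0, 1.0, 64)).sum())  # ~= 40/63
```

With the peaked weights, the expected depth along the ray coincides with the single selected surface sample rather than a blur of several candidates.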
Citations: 0
Advancing agricultural remote sensing: A comprehensive review of deep supervised and Self-Supervised Learning for crop monitoring
IF 2.8 · CAS Tier 4 (Computer Science) · Q2 COMPUTER SCIENCE, SOFTWARE ENGINEERING · Pub Date: 2025-09-26 · DOI: 10.1016/j.cag.2025.104434
Mateus Pinto da Silva, Sabrina P.L.P. Correa, Mariana A.R. Schaefer, Julio C.S. Reis, Ian M. Nunes, Jefersson A. dos Santos, Hugo N. Oliveira
Deep Learning based on Remote Sensing has become a powerful tool to increase agricultural productivity, mitigate the effects of climate change, and monitor deforestation. However, the available literature lacks standardization and an appropriate taxonomy in the context of informatics. Taking advantage of the categories already available in the literature, this paper provides an overview of the relevant literature categorized into five main applications: Parcel Segmentation, Crop Mapping, Crop Yielding, Land Use and Land Cover, and Change Detection. We review notable trends, including the transition from traditional to deep learning, convolutional models, recurrent and attention-based models, and generative strategies. We also map the use of Self-Supervised Learning through contrastive, non-contrastive, data masking and hybrid semi-supervised pretraining for the aforementioned applications, with an experimental benchmark for Post-Harvest Crop Mapping models, and present our solution, SITS-Siam, which achieves top performance on two of the three datasets tested. In addition, we provide a comprehensive overview of publicly available datasets for these applications, as well as unlabeled datasets for Remote Sensing in general. We hope that our work can serve as a guide for future work in this context. The benchmark code and the pre-trained weights are available at https://github.com/mateuspinto/rs-agriculture-survey-extended.
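Of the self-supervised families the review maps, the non-contrastive branch can be illustrated with the negative-cosine objective popularized by SimSiam-style training. This sketch shows only the loss; whether SITS-Siam uses exactly this form is our assumption, not a claim from the abstract:

```python
import numpy as np

def negcos_loss(p, z):
    # Negative cosine similarity between predictor output `p` and the
    # target embedding `z`, as in SimSiam-style non-contrastive
    # pretraining. NumPy has no autograd, so the stop-gradient on `z`
    # is implied rather than implemented.
    p = p / np.linalg.norm(p, axis=-1, keepdims=True)
    z = z / np.linalg.norm(z, axis=-1, keepdims=True)
    return float(-(p * z).sum(axis=-1).mean())

# Collinear embeddings reach the minimum value, -1.0.
loss = negcos_loss(np.array([[1.0, 0.0]]), np.array([[2.0, 0.0]]))
```

Minimizing this over two augmented views of the same satellite image time series drives their embeddings to agree without requiring negative pairs.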
Citations: 0
Diffusion model-based size variable virtual try-on technology and evaluation method
IF 2.8 · CAS Tier 4 (Computer Science) · Q2 COMPUTER SCIENCE, SOFTWARE ENGINEERING · Pub Date: 2025-09-25 · DOI: 10.1016/j.cag.2025.104448
Shufang Zhang, Hang Qian, Minxue Ni, Yaxuan Li, Wenxin Ding, Jun Liu
With the rapid development of electronic commerce, virtual try-on technology has become an essential tool to satisfy consumers’ personalised clothing preferences. Diffusion-based virtual try-on systems aim to naturally align garments with target individuals, generating realistic and detailed try-on images. However, existing methods overlook the importance of garment size variations in meeting personalised consumer needs. To address this, we propose a novel virtual try-on method named SV-VTON, which introduces garment sizing concepts into virtual try-on tasks. The SV-VTON method first generates refined masks for multiple garment sizes, then integrates these masks with garment images at varying proportions, enabling virtual try-on simulations across different sizes. In addition, we develop a specialised size evaluation module to quantitatively assess the accuracy of size variations. This module calculates differences between generated size increments and international sizing standards, providing objective measurements of size accuracy. To further validate SV-VTON’s generalisation capability across different models, we conduct experiments on multiple SOTA Diffusion models. The results demonstrate that SV-VTON consistently achieves precise multi-size virtual try-on across various SOTA models, and validates the effectiveness and rationality of the proposed method, significantly fulfilling users’ personalised multi-size virtual try-on requirements.
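The size evaluation idea, comparing the increments between consecutive generated sizes against a sizing standard, reduces to a simple metric. A hypothetical sketch (the measurement, its values, and the 4 cm-per-size standard are invented for illustration; the paper's module and standards table are not reproduced):

```python
def size_increment_error(measured_cm, standard_cm):
    # Mean absolute deviation between the increments of consecutive
    # generated sizes and the increments prescribed by a sizing table.
    gen = [b - a for a, b in zip(measured_cm, measured_cm[1:])]
    std = [b - a for a, b in zip(standard_cm, standard_cm[1:])]
    return sum(abs(g - s) for g, s in zip(gen, std)) / len(gen)

# Chest widths (cm) measured on try-on renders for sizes S..XL,
# against a standard that adds 4 cm per size step.
err = size_increment_error([88.0, 92.5, 96.0, 100.5],
                           [88.0, 92.0, 96.0, 100.0])  # 0.5 cm
```

Scoring increments rather than absolute measurements makes the metric insensitive to the base size and focuses it on whether each size step grows by the standardized amount.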
Citations: 0
The vividness of mental imagery in virtual reality: A study on multisensory experiences in virtual tourism
IF 2.8 · CAS Tier 4 (Computer Science) · Q2 COMPUTER SCIENCE, SOFTWARE ENGINEERING · Pub Date: 2025-09-25 · DOI: 10.1016/j.cag.2025.104443
Mariana Magalhães, Miguel Melo, António Coelho, Maximino Bessa
This paper aims to evaluate how different combinations of multisensory stimuli affect the vividness of users’ mental imagery in the context of virtual tourism. To this end, a between-subjects experimental study was conducted with 94 participants, who were allocated to either a positive or a negative immersive virtual environment. The positive environment contained only pleasant multisensory stimuli, whereas the negative contained only unpleasant stimuli. For each of the virtual experiences, a multisensory treasure hunt was developed, where each object found corresponded to a planned combination of stimuli (positive or negative, accordingly). The results showed that positive stimuli involving a higher number of sensory modalities resulted in higher reported vividness. In contrast, when the same multisensory modalities were delivered with negative stimuli, vividness levels decreased — an effect we attribute to potential cognitive overload. Nevertheless, some reduced negative combinations (audiovisual with smell and audiovisual with haptics) remained effective, indicating that olfactory and haptic cues play an important role in shaping users’ vividness of mental imagery, even in negative contexts.
Citations: 0
Fusing multi-stage clicks with deep feedback aggregation for interactive image segmentation
IF 2.8 · CAS Tier 4 (Computer Science) · Q2 COMPUTER SCIENCE, SOFTWARE ENGINEERING · Pub Date: 2025-09-24 · DOI: 10.1016/j.cag.2025.104445
Jianwu Long, Yuanqin Liu, Shaoyi Wang, Shuang Chen, Qi Luo
The objective of interactive image segmentation is to generate a segmentation mask for the target object using minimal user interaction. During the interaction process, segmentation results from previous iterations are typically used as feedback to guide subsequent user input. However, existing approaches often concatenate user interactions, feedback, and low-level image features as direct inputs to the network, overlooking the high-level semantic information contained in the feedback and the dilution of click signals through the network. To address these limitations, we propose a novel interactive image segmentation model called Multi-stage Click Fusion with deep Feedback Aggregation (MCFA). MCFA introduces a new information fusion strategy. Specifically, for feedback information, it refines previous-round feedback using deep features and integrates the optimized feedback into the feature representation. For user clicks, MCFA performs multi-stage fusion to enhance click propagation while constraining its direction through the refined feedback. Experimental results demonstrate that MCFA consistently outperforms existing methods across five benchmark datasets: GrabCut, Berkeley, SBD, DAVIS and CVC-ClinicDB.
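The multi-stage idea, injecting click information at several encoder resolutions rather than only at the input, can be sketched by re-rasterizing the click map per stage so the signal is not diluted by network depth. Disk radius, stage count, and the downscaling scheme below are illustrative; the paper's fusion operators are not reproduced:

```python
import numpy as np

def click_map(h, w, clicks, radius=3):
    # Rasterize user clicks as binary disks, a common input encoding
    # in interactive segmentation.
    yy, xx = np.mgrid[0:h, 0:w]
    m = np.zeros((h, w), dtype=np.float32)
    for cy, cx in clicks:
        m[(yy - cy) ** 2 + (xx - cx) ** 2 <= radius ** 2] = 1.0
    return m

def multi_stage_click_maps(h, w, clicks, stages=3):
    # One map per encoder stage, downscaled together with the feature
    # grid, ready to be fused at each resolution.
    return [
        click_map(h >> s, w >> s, [(cy >> s, cx >> s) for cy, cx in clicks])
        for s in range(stages)
    ]

maps = multi_stage_click_maps(64, 64, [(32, 32)])
# maps[0]: 64x64, maps[1]: 32x32, maps[2]: 16x16, each marking the click
```

Each stage's map would then be concatenated (or otherwise fused) with that stage's feature tensor instead of entering only at the first layer.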
Citations: 0
HR-2DGS: Hybrid regularization for sparse-view 3D reconstruction with 2D Gaussian splatting
IF 2.8 · CAS Tier 4 (Computer Science) · Q2 COMPUTER SCIENCE, SOFTWARE ENGINEERING · Pub Date: 2025-09-23 · DOI: 10.1016/j.cag.2025.104444
Yong Tang, Jiawen Yan, Yu Li, Yu Liang, Feng Wang, Jing Zhao
Sparse-view 3D reconstruction has garnered widespread attention due to its demand for high-quality reconstruction under low-sampling data conditions. Existing NeRF-based methods rely on dense views and substantial computational resources, while 3DGS is limited by multi-view inconsistency and insufficient geometric detail recovery, making it challenging to achieve ideal results in sparse-view scenarios. This paper introduces HR-2DGS, a novel hybrid regularization framework based on 2D Gaussian Splatting (2DGS), which significantly enhances multi-view consistency and geometric recovery by dynamically fusing monocular depth estimates with rendered depth maps, incorporating hybrid normal regularization techniques. To further refine local details, we introduce a per-pixel depth normalization that leverages each pixel’s neighborhood statistics to emphasize fine-scale geometric variations. Experimental results on the LLFF and DTU datasets demonstrate that HR-2DGS outperforms existing methods in terms of PSNR, SSIM, and LPIPS, while requiring only 2.5GB of memory and a few minutes of training time for efficient training and real-time rendering.
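The per-pixel depth normalization can be made concrete: standardize each pixel's depth against its neighborhood statistics so that fine-scale geometric variation dominates over absolute scale. A minimal sketch, assuming a simple k x k window (window size and epsilon are our assumptions; the paper's exact scheme may differ):

```python
import numpy as np

def local_depth_norm(depth, k=3, eps=1e-6):
    # Standardize each depth value by the mean/std of its k x k
    # neighborhood, emphasizing local geometric detail.
    pad = k // 2
    d = np.pad(depth, pad, mode="edge")
    h, w = depth.shape
    # Stack the k*k shifted views, then reduce over the window axis.
    win = np.stack([d[i:i + h, j:j + w] for i in range(k) for j in range(k)])
    mu, sigma = win.mean(axis=0), win.std(axis=0)
    return (depth - mu) / (sigma + eps)

# On a smooth horizontal ramp, interior pixels normalize to ~0 because
# each pixel equals its own window mean; only borders deviate.
ramp = np.tile(np.arange(8.0), (8, 1))
out = local_depth_norm(ramp)
```

A loss on these normalized values penalizes disagreement in local depth structure between the monocular estimate and the rendered depth, independently of any global scale or shift between the two.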
Citations: 0
DeepSES: Learning solvent-excluded surfaces via neural signed distance fields
IF 2.8 · CAS Tier 4 (Computer Science) · Q2 COMPUTER SCIENCE, SOFTWARE ENGINEERING · Pub Date: 2025-09-23 · DOI: 10.1016/j.cag.2025.104392
Niklas Merk, Anna Sterzik, Kai Lawonn
The solvent-excluded surface (SES) is essential for revealing molecular shape and solvent accessibility in applications such as molecular modeling, drug discovery, and protein folding. Its signed distance field (SDF) delivers a continuous, differentiable surface representation that enables efficient rendering, analysis, and interaction in volumetric visualization frameworks. However, analytic methods that compute the SDF of the SES cannot run at interactive rates on large biomolecular complexes, and grid-based methods tend to result in significant approximation errors, depending on molecular size and grid resolution. We address these limitations with DeepSES, a neural inference pipeline that predicts the SES SDF directly from the computationally simpler van der Waals (vdW) SDF on a fixed high-resolution grid. By employing an adaptive volume-filtering scheme that directs processing only to visible regions near the molecular surface, DeepSES yields interactive frame rates irrespective of molecule size. By offering multiple network configurations, DeepSES enables practitioners to balance inference time against prediction accuracy. In benchmarks on molecules ranging from one thousand to nearly four million atoms, our fastest configuration achieves real-time frame rates with a sub-angstrom mean error, while our highest-accuracy variant sustains interactive performance and outperforms state-of-the-art methods in terms of surface quality. By replacing costly algorithmic solvers with selective neural prediction, DeepSES provides a scalable, high-resolution solution for interactive biomolecular visualization.
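The network's input, the vdW SDF, is the computationally simple part: at each query point it is the distance to the nearest atom sphere, negative inside any atom. A minimal NumPy sketch (the atom data are invented for illustration, and a real pipeline would use a spatial index rather than this dense pairwise form; the neural mapping to the SES SDF is omitted):

```python
import numpy as np

def vdw_sdf(points, centers, radii):
    # Signed distance from query points (N,3) to the van der Waals
    # surface of atoms at `centers` (M,3) with `radii` (M,): distance
    # to the closest atom sphere, negative inside an atom.
    d = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=-1)
    return (d - radii[None, :]).min(axis=1)

# One carbon-like atom (radius 1.5 Angstrom) at the origin.
centers = np.array([[0.0, 0.0, 0.0]])
radii = np.array([1.5])
pts = np.array([[0.0, 0.0, 0.0], [3.0, 0.0, 0.0]])
sdf = vdw_sdf(pts, centers, radii)  # [-1.5, 1.5]
```

Evaluating this field on a fixed high-resolution grid yields the volume that DeepSES then maps to the SES SDF with its neural inference pipeline.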
溶剂排除表面(SES)在分子建模、药物发现和蛋白质折叠等应用中对于揭示分子形状和溶剂可及性至关重要。它的符号距离域(SDF)提供了一个连续的、可微的表面表示,在体积可视化框架中实现了高效的渲染、分析和交互。然而,计算SES的SDF的分析方法不能在大型生物分子复合物上以交互速率运行,并且基于网格的方法往往会导致显着的近似误差,这取决于分子大小和网格分辨率。我们使用DeepSES解决了这些限制,DeepSES是一种神经推理管道,可以直接从固定高分辨率网格上计算更简单的范德瓦尔斯(vdW) SDF预测SES SDF。通过采用自适应体积滤波方案,只对分子表面附近的可见区域进行处理,DeepSES产生的交互帧率与分子大小无关。通过提供多种网络配置,DeepSES使从业者能够平衡推理时间和预测准确性。在从1000到近400万个原子的分子基准测试中,我们最快的配置实现了亚埃平均误差的实时帧率,而我们最高精度的变体保持了交互性能,并在表面质量方面优于最先进的方法。通过用选择性神经预测取代昂贵的算法求解器,DeepSES为交互式生物分子可视化提供了可扩展的高分辨率解决方案。
{"title":"DeepSES: Learning solvent-excluded surfaces via neural signed distance fields","authors":"Niklas Merk,&nbsp;Anna Sterzik,&nbsp;Kai Lawonn","doi":"10.1016/j.cag.2025.104392","DOIUrl":"10.1016/j.cag.2025.104392","url":null,"abstract":"<div><div>The solvent-excluded surface (SES) is essential for revealing molecular shape and solvent accessibility in applications such as molecular modeling, drug discovery, and protein folding. Its signed distance field (SDF) delivers a continuous, differentiable surface representation that enables efficient rendering, analysis, and interaction in volumetric visualization frameworks. However, analytic methods that compute the SDF of the SES cannot run at interactive rates on large biomolecular complexes, and grid-based methods tend to result in significant approximation errors, depending on molecular size and grid resolution. We address these limitations with DeepSES, a neural inference pipeline that predicts the SES SDF directly from the computationally simpler van der Waals (vdW) SDF on a fixed high-resolution grid. By employing an adaptive volume-filtering scheme that directs processing only to visible regions near the molecular surface, DeepSES yields interactive frame rates irrespective of molecule size. By offering multiple network configurations, DeepSES enables practitioners to balance inference time against prediction accuracy. In benchmarks on molecules ranging from one thousand to nearly four million atoms, our fastest configuration achieves real-time frame rates with a sub-angstrom mean error, while our highest-accuracy variant sustains interactive performance and outperforms state-of-the-art methods in terms of surface quality. By replacing costly algorithmic solvers with selective neural prediction, DeepSES provides a scalable, high-resolution solution for interactive biomolecular visualization.</div></div>","PeriodicalId":50628,"journal":{"name":"Computers & Graphics-Uk","volume":"133 ","pages":"Article 104392"},"PeriodicalIF":2.8,"publicationDate":"2025-09-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145160124","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Cited by: 0
Narrowing-Cascade splines for control nets that shed mesh lines
IF 2.8 Tier 4 (Computer Science) Q2 COMPUTER SCIENCE, SOFTWARE ENGINEERING Pub Date : 2025-09-22 DOI: 10.1016/j.cag.2025.104441
Serhat Cam , Erkan Gunpinar , Kęstutis Karčiauskas , Jörg Peters
Quad-dominant meshes are popular with animation designers and can efficiently be generated from point clouds. To join primary regions, quad-dominant meshes include non-4-valent vertices and non-quad regions. To transition between regions of rich detail and simple shape, quad-dominant meshes commonly use a cascade of n−1 triangles that reduce the number of parallel quad strips from n+1 to 2. For these cascades, the Narrowing-Cascade spline, short NC^n, provides a new shape-optimized G^1 spline surface. NC^n can treat cascade meshes as B-spline-like control nets. For n>3, as opposed to n=2,3, cascades have interior points that both guide and complicate the construction of the output tensor-product NC spline. The NC^n spline follows the input mesh, including interior points, and delivers a high-quality curved surface of low degree.
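The abstract's key move is treating cascade meshes as B-spline-like control nets. As background for the regular case only, the sketch below evaluates one uniform bicubic B-spline patch from a 4×4 control net; the NC^n construction itself, which handles the irregular cascade region, is the paper's contribution and is not reproduced here.

```python
import numpy as np

# Uniform cubic B-spline basis in matrix form: p(u) = [u^3 u^2 u 1] B [P0..P3]^T.
B = (1.0 / 6.0) * np.array([
    [-1.0,  3.0, -3.0, 1.0],
    [ 3.0, -6.0,  3.0, 0.0],
    [-3.0,  0.0,  3.0, 0.0],
    [ 1.0,  4.0,  1.0, 0.0],
])

def bspline_patch(P, u, v):
    """Evaluate a bicubic uniform B-spline patch from a 4x4x3 control net."""
    U = np.array([u**3, u**2, u, 1.0]) @ B  # weights in the u-direction
    V = np.array([v**3, v**2, v, 1.0]) @ B  # weights in the v-direction
    # Tensor-product combination of the 16 control points.
    return np.einsum("i,ijk,j->k", U, P, V)

# A flat 4x4 control net at height z = 1: control point (i, j) sits at (i, j, 1).
gx, gy = np.meshgrid(np.arange(4.0), np.arange(4.0), indexing="ij")
P = np.stack([gx, gy, np.ones_like(gx)], axis=-1)

# Since the basis functions form a partition of unity, the patch reproduces
# the plane: the patch midpoint is the net's affine average, (1.5, 1.5, 1.0).
print(bspline_patch(P, 0.5, 0.5))
```

The same evaluation applies wherever the control net is locally a regular 4×4 grid; the cascade region with its interior points is precisely where this regular machinery breaks down and the NC^n construction takes over.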
{"title":"Narrowing-Cascade splines for control nets that shed mesh lines","authors":"Serhat Cam ,&nbsp;Erkan Gunpinar ,&nbsp;Kȩstutis Karčiauskas ,&nbsp;Jörg Peters","doi":"10.1016/j.cag.2025.104441","DOIUrl":"10.1016/j.cag.2025.104441","url":null,"abstract":"<div><div>Quad-dominant meshes are popular with animation designers and can efficiently be generated from point clouds. To join primary regions, quad-dominant meshes include non-4-valent vertices and non-quad regions. To transition between regions of rich detail and simple shape, quad-dominant meshes commonly use a cascade of <span><math><mrow><mi>n</mi><mo>−</mo><mn>1</mn></mrow></math></span> triangles that reduce the number of parallel quad strips from <span><math><mrow><mi>n</mi><mo>+</mo><mn>1</mn></mrow></math></span> to 2. For these cascades, the Narrowing-Cascade spline, short NC<span><math><msup><mrow></mrow><mrow><mi>n</mi></mrow></msup></math></span>, provides a new shape-optimized <span><math><msup><mrow><mi>G</mi></mrow><mrow><mn>1</mn></mrow></msup></math></span> spline surface. NC<span><math><msup><mrow></mrow><mrow><mi>n</mi></mrow></msup></math></span> can treat cascade meshes as B-spline-like control nets. For <span><math><mrow><mi>n</mi><mo>&gt;</mo><mn>3</mn></mrow></math></span>, as opposed to <span><math><mrow><mi>n</mi><mo>=</mo><mn>2</mn><mo>,</mo><mn>3</mn></mrow></math></span>, cascades have interior points that both guide and complicate the construction of the output tensor-product NC<span><math><msup><mrow></mrow><mrow><mspace></mspace></mrow></msup></math></span>spline. The NC<span><math><msup><mrow></mrow><mrow><mi>n</mi></mrow></msup></math></span> spline follows the input mesh, including interior points, and delivers a high-quality curved surface of low degree.</div></div>","PeriodicalId":50628,"journal":{"name":"Computers & Graphics-Uk","volume":"133 ","pages":"Article 104441"},"PeriodicalIF":2.8,"publicationDate":"2025-09-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145269530","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Cited by: 0