
Computers & Graphics-Uk: Latest Articles

GEAST-RF: Geometry Enhanced 3D Arbitrary Style Transfer Via Neural Radiance Fields
IF 2.5 | Zone 4, Computer Science | Q2 COMPUTER SCIENCE, SOFTWARE ENGINEERING | Vol. 127, Article 104181 | Pub Date: 2025-02-16 | DOI: 10.1016/j.cag.2025.104181
Dong He , Wenhua Qian , Jinde Cao
Style transfer techniques integrated with neural radiance fields enhance the stylization effect of the 3D scene. The objective of 3D style transfer is to render novel views of stylized 3D scenes while maintaining multi-view consistency. However, the current state of 3D style transfer confronts three principal challenges: precise geometric reconstruction, style bias issues, and the artifacts and floaters that frequently emerge during the stylization process. To address these issues, we propose GEAST-RF (Geometry Enhanced 3D Arbitrary Style Transfer Via Neural Radiance Fields), which employs explicit high-level feature grids to represent 3D scenes, achieving detailed geometry reconstruction through volume rendering and high-quality 3D arbitrary style transfer based on target style image information. Specifically, GEAST-RF introduces two pivotal innovations to enhance 3D stylization. The first is the geometry enhancements module, which aligns the geometric structures of stylized views from the same viewpoint to those in the content views, enabling high-precision geometry reconstruction. Thresholding and masking operations are introduced during alignment to alleviate artifacts such as floaters produced during rendering. The second is the adaptive stylization module, which utilizes adaptive computation during the stylization stage to make the model focus more on core style information, reducing reliance on edge style information. Our experiments demonstrate that GEAST-RF can achieve precise geometric structures while providing exceptional 3D stylization effects. A user survey further corroborates these experimental results, revealing that the majority of participants prefer our generated outputs compared to the most recent state-of-the-art methods.
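The abstract refers to geometry reconstruction through volume rendering. As a rough illustration only, the sketch below shows the standard radiance-field compositing rule along a single ray; it is not GEAST-RF's implementation, and the function name `composite` and the scalar "color" values are assumptions made for brevity.

```python
import math

def composite(sigmas, deltas, colors):
    """Composite samples along one ray, front to back.

    sigmas -- density at each sample
    deltas -- distance between consecutive samples
    colors -- scalar color at each sample (one channel, for brevity)
    Returns the rendered value and the per-sample weights.
    """
    transmittance = 1.0  # fraction of light that reaches the current sample
    pixel = 0.0
    weights = []
    for sigma, delta, color in zip(sigmas, deltas, colors):
        alpha = 1.0 - math.exp(-sigma * delta)  # opacity of this segment
        weight = transmittance * alpha
        weights.append(weight)
        pixel += weight * color
        transmittance *= 1.0 - alpha
    return pixel, weights
```

The per-sample weights are the quantity a depth or geometry term would typically be built from; how the paper's thresholding and masking act on them is not specified in the abstract.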
Citations: 0
Computing skeleton-based handle/tunnel loops
IF 2.5 | Zone 4, Computer Science | Q2 COMPUTER SCIENCE, SOFTWARE ENGINEERING | Vol. 127, Article 104177 | Pub Date: 2025-02-12 | DOI: 10.1016/j.cag.2025.104177
Hayam Abdelrahman, Yiying Tong
Finding surface loops around narrow sections of a surface is widely used as a preprocessing step in applications such as segmentation, shape analysis, path planning, and robotics. A common approach to locating such loops is based on surface topology. However, such geodesic loops also exist on topologically trivial genus-0 surfaces, where all such loops can continuously deform to a point. While a few existing 3D geometry-aware topological approaches may succeed in detecting such additional narrow loops, their construction can be cumbersome. To extend beyond the limitations of topologically nontrivial independent loops while remaining efficient, we propose a novel approach that leverages the shape’s skeleton for computing surface loops of handle or tunnel type. Given a closed surface mesh, our algorithm produces a practically comprehensive set of loops encircling narrow regions of the volume inside or outside the surface. Notably, our approach streamlines and expedites computations by accepting a skeleton, a 1D representation of the shape, as part of the input. Specifically, handle-type loops are discovered by examining a small subset of the skeleton points as candidate loop centers, while tunnel-type loops are identified by examining only the high-valence skeleton points.
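To make "high-valence skeleton points" concrete, here is a minimal sketch that treats the skeleton as an undirected graph given by an edge list and returns the nodes where three or more branches meet. The edge-list representation, the function name `high_valence_points`, and the threshold of 3 are assumptions for illustration; the abstract does not say how the paper defines or stores these points.

```python
from collections import defaultdict

def high_valence_points(edges, min_valence=3):
    """Return skeleton nodes whose degree is at least `min_valence`.

    edges -- iterable of (node_a, node_b) pairs describing the 1D skeleton
    """
    degree = defaultdict(int)
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    return sorted(node for node, d in degree.items() if d >= min_valence)

# A Y-shaped skeleton: node 1 is the only junction.
y_skeleton = [(0, 1), (1, 2), (1, 3), (3, 4)]
junctions = high_valence_points(y_skeleton)
```

Restricting the tunnel-loop search to such junctions is what keeps the candidate set small compared with scanning every skeleton point.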
Citations: 0
No-reference geometry quality assessment for colorless point clouds via list-wise rank learning
IF 2.5 | Zone 4, Computer Science | Q2 COMPUTER SCIENCE, SOFTWARE ENGINEERING | Vol. 127, Article 104176 | Pub Date: 2025-02-11 | DOI: 10.1016/j.cag.2025.104176
Zheng Li , Bingxu Xie , Chao Chu , Weiqing Li , Zhiyong Su
Geometry quality assessment (GQA) of colorless point clouds is crucial for evaluating the performance of emerging point cloud-based solutions (e.g., watermarking, compression, and 3-Dimensional (3D) reconstruction). Unfortunately, existing objective GQA approaches are traditional full-reference metrics, whereas state-of-the-art learning-based point cloud quality assessment (PCQA) methods target both color and geometry distortions; neither is suited to the no-reference GQA task. In addition, the lack of large-scale GQA datasets with subjective scores, which are always imprecise, biased, and inconsistent, also hinders the development of learning-based GQA metrics. Driven by these limitations, this paper proposes a no-reference geometry-only quality assessment approach based on list-wise rank learning, termed LRL-GQA, which comprises a geometry quality assessment network (GQANet) and a list-wise rank learning network (LRLNet). The proposed LRL-GQA formulates no-reference GQA as a list-wise ranking problem, with the objective of directly optimizing the entire quality ordering. Specifically, a large dataset containing a variety of geometry-only distortions, named the LRL dataset, is constructed first; each sample in it is label-free but coupled with quality ranking information. Then, the GQANet is designed to capture intrinsic multi-scale patch-wise geometric features in order to predict a quality index for each point cloud. After that, the LRLNet leverages the LRL dataset and a likelihood loss to train the GQANet and ranks the input list of degraded point clouds according to their distortion levels. In addition, the pre-trained GQANet can be fine-tuned further to obtain absolute quality scores. Experimental results demonstrate the superior performance of the proposed no-reference LRL-GQA method compared with existing full-reference GQA metrics. The source code can be found at: https://github.com/VCG-NJUST/LRL-GQA.
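The abstract mentions training with a likelihood loss over a ranked list. One common form of such a loss is the negative log-likelihood of the ground-truth ordering under a Plackett-Luce model (as in ListMLE). The sketch below assumes that form; whether LRL-GQA uses exactly this loss is not stated in the abstract, and the function name `listwise_likelihood_loss` is hypothetical.

```python
import math

def listwise_likelihood_loss(scores, ranking):
    """Negative log-likelihood of `ranking` under a Plackett-Luce model.

    scores  -- predicted quality index per item (higher means better quality)
    ranking -- item indices ordered from best to worst (the known ordering)
    """
    ordered = [scores[i] for i in ranking]
    loss = 0.0
    for k in range(len(ordered)):
        tail = ordered[k:]
        m = max(tail)  # subtract the max before exponentiating, for stability
        log_norm = m + math.log(sum(math.exp(s - m) for s in tail))
        # Probability that item k is picked first among the remaining items.
        loss += log_norm - ordered[k]
    return loss

# Scores that agree with the known ordering give a lower loss than
# scores that contradict it.
agree = listwise_likelihood_loss([3.0, 2.0, 1.0], [0, 1, 2])
disagree = listwise_likelihood_loss([3.0, 2.0, 1.0], [2, 1, 0])
```

Because the loss depends only on the ordering, it can be trained from ranking information alone, which matches the abstract's description of label-free samples.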
Citations: 0
A Triple Complementary Stream Network based on forgery feature enhancement and coupling for universal face forgery localization
IF 2.5 | Zone 4, Computer Science | Q2 COMPUTER SCIENCE, SOFTWARE ENGINEERING | Vol. 126, Article 104153 | Pub Date: 2025-02-01 | DOI: 10.1016/j.cag.2024.104153
Haoyu Wang , Xu Sun , Yuying Sun , Peihong Li
Existing face forgery detection methods are vulnerable to unseen facial manipulations and forgery techniques, and cannot accurately locate the forged area. To solve this problem, we propose a Triple Complementary Stream Network (TCSN) for universal face forgery localization. TCSN explores universal forgery clues from the depth stream, RGB stream, and frequency stream. First, we construct a feature enhancement module that employs the features of the complementary streams to suppress semantic features and capture the universal forgery features. Subsequently, we design a dynamic affinity graph feature coupling module based on affinity propagation. This module utilizes the correlation between the forgery features of different streams to promote the transfer of shared and specific features across streams. TCSN achieved state-of-the-art performance on three face forgery localization datasets and demonstrated strong generalization ability. Our code and datasets are available at https://github.com/hywang02/TCSN.
Citations: 0
Visual demonstration of the Hubble law
IF 2.5 | Zone 4, Computer Science | Q2 COMPUTER SCIENCE, SOFTWARE ENGINEERING | Vol. 126, Article 104159 | Pub Date: 2025-02-01 | DOI: 10.1016/j.cag.2024.104159
András Fridvalszky, László Szirmay-Kalos
In 1929, Edwin Hubble found observational evidence that the universe has a finite age and is expanding, which provided strong support for the Big Bang theory. The idea is that if galaxies are moving away from each other now, they must have been closer together in the past, eventually leading back to a singular point. Although Hubble’s discovery can be summarized by a very simple equation, the consequences of this phenomenon are hard to imagine without visualization. This paper presents a model for the calculation of the spectral radiance in expanding spaces and a GPU-efficient visualization algorithm to demonstrate the expansion of the universe. The model allows the physical parameters to be modified, so it is appropriate for teaching and for testing different scenarios.
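The "very simple equation" is Hubble's law, v = H0 · d: recession velocity is proportional to distance. The sketch below evaluates it and the associated Hubble time 1/H0. The value H0 = 70 km/s/Mpc is an illustrative assumption rather than a figure from the paper, and the function names are hypothetical.

```python
KM_PER_MPC = 3.0856775814913673e19  # kilometres in one megaparsec
SECONDS_PER_GYR = 3.15576e16        # seconds in 10^9 Julian years

def recession_velocity(distance_mpc, h0=70.0):
    """Hubble's law: velocity in km/s for a distance in Mpc."""
    return h0 * distance_mpc

def hubble_time_gyr(h0=70.0):
    """1/H0 in billions of years: a rough scale for the age of the universe."""
    return KM_PER_MPC / h0 / SECONDS_PER_GYR

velocity = recession_velocity(100.0)  # a galaxy 100 Mpc away
age_scale = hubble_time_gyr()
```

The Hubble time equals the true age only if the expansion rate were constant, which is one reason a model with adjustable physical parameters is useful for exploring other scenarios.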
Citations: 0
Three dimensional forest dynamic evolution based on hydraulic erosion and forest fire disturbance
IF 2.5 | Zone 4, Computer Science | Q2 COMPUTER SCIENCE, SOFTWARE ENGINEERING | Vol. 126, Article 104152 | Pub Date: 2025-02-01 | DOI: 10.1016/j.cag.2024.104152
Qingkuo Meng, Yongjian Huai, Xiaoying Wang, Ziyang Li, Rui Zhang, Xiaoying Nie
Forest ecosystems can change due to both human activities and climatic factors, particularly shifts in temperature, rain, and wind patterns. Topographic changes caused by rains and structural shifts induced by forest fires represent two primary disturbance events in forest environments. These disturbances are influenced by weather factors and exhibit complex effects on forest dynamics, characterized by regional, seasonal, and stochastic variations. Consequently, examining the interactions between weather patterns and forest evolution through computer graphics holds significant research value. Vegetation and terrain modeling are fundamental to generating realistic forest landscapes. We employ physically-based procedural erosion to simulate geomorphological erosion processes, while further exploring vegetation-terrain interactions to create high-resolution landscapes. Using data from real forest landscapes, we incorporate fire ignition points to simulate forest fire occurrence and spread by modeling wildfire combustion and heat transfer processes, which accurately capture fire dynamics. This enables the simulation of forest fire scenarios under various environmental conditions, allowing us to assess the combined impacts of rainfall and forest fires on forest landscapes. Additionally, the model ensures real-time interaction, supporting the creation of immersive and responsive landscape simulations.
Citations: 0
MT-NeRF: Neural implicit representation based on multi-resolution geometric feature planes
IF 2.5 | Zone 4, Computer Science | Q2 COMPUTER SCIENCE, SOFTWARE ENGINEERING | Vol. 126, Article 104157 | Pub Date: 2025-02-01 | DOI: 10.1016/j.cag.2024.104157
Wanqi Jiang, Yafei Liu, Mujiao Ouyang, Xiaoguo Zhang
Reconstructing an indoor-scale scene from scratch is a difficult task when the camera pose is unknown. The task becomes even more challenging when fast convergence without loss of quality and low memory usage are also required. In this paper, we propose MT-NeRF, a novel radiance field rendering method based on RGB-D inputs without pre-computed camera poses. MT-NeRF maps indoor scenes at real-world scales to multi-resolution geometric feature planes, which greatly reduces the memory footprint and enhances detailed scene fitting. In addition, MT-NeRF significantly enhances the localization accuracy of the system by introducing a photometric distortion loss based on interframe surface pixels. For keyframe selection, MT-NeRF employs a global-to-local strategy, which markedly enhances the global consistency of scene reconstruction. Experiments are designed and conducted to validate the effectiveness of MT-NeRF in scenarios involving complex motion or noisy depth map inputs. The results demonstrate remarkable improvements in scene reconstruction quality and pose estimation accuracy, all while ensuring a low memory footprint. At the same time, our method achieves a speedup of approximately fivefold.
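To illustrate why feature planes save memory compared with a dense 3D grid, the sketch below looks up a 3D point in three axis-aligned 2D planes at each of several resolutions and concatenates the bilinearly interpolated values. The tri-plane layout, the one-scalar-per-cell features, and the names `bilinear` and `query_features` are all assumptions for illustration; the abstract does not describe MT-NeRF's actual plane layout.

```python
def bilinear(plane, u, v):
    """Sample a 2D grid (list of rows, at least 2x2) at u, v in [0, 1]."""
    h, w = len(plane), len(plane[0])
    x, y = u * (w - 1), v * (h - 1)
    x0, y0 = min(int(x), w - 2), min(int(y), h - 2)  # clamp to the last cell
    tx, ty = x - x0, y - y0
    top = plane[y0][x0] * (1 - tx) + plane[y0][x0 + 1] * tx
    bottom = plane[y0 + 1][x0] * (1 - tx) + plane[y0 + 1][x0 + 1] * tx
    return top * (1 - ty) + bottom * ty

def query_features(levels, point):
    """Concatenate plane features for a point in the unit cube.

    levels -- list (coarse to fine) of dicts with "xy", "xz", "yz" grids
    point  -- (x, y, z), each coordinate in [0, 1]
    """
    x, y, z = point
    features = []
    for planes in levels:
        features.append(bilinear(planes["xy"], x, y))
        features.append(bilinear(planes["xz"], x, z))
        features.append(bilinear(planes["yz"], y, z))
    return features
```

With resolution N per axis, three planes store on the order of 3N² values per level instead of N³, and the coarse levels capture the overall layout while the fine levels add detail.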
Citations: 0
Sample-efficient reference-free control strategy for multi-legged locomotion
IF 2.5 | Zone 4, Computer Science | Q2 COMPUTER SCIENCE, SOFTWARE ENGINEERING | Vol. 126, Article 104141 | Pub Date: 2025-02-01 | DOI: 10.1016/j.cag.2024.104141
Gangrae Park , Jaepyung Hwang , Taesoo Kwon
Locomotion is one of the fundamental skills that is challenging to simulate in a manner that generalizes across a wide range of speeds and turning capabilities. In this paper, our goal is to develop a versatile locomotion controller applicable to various multi-legged character models (monopod, biped, and quadruped), enabling them to perform a range of tasks such as speed control, steering, moving to target locations, and slope walking. Our method is capable of generating diverse multi-legged locomotions without the need for reference motions, even when faced with the inherent challenge of coordinating multiple legs simultaneously. Based on deep reinforcement learning, we train our policy network to produce desired feet locations and orientations, enhancing sample efficiency and robustness compared to the commonly used joint angles. Utilizing end-effector configurations allows for intuitive adaptation to various locomotion gaits. Additionally, we design a style reward function that is applicable to different types of multi-legged models. The locomotion controller, trained with this reward, effectively performs given tasks in a physically simulated environment while maintaining the naturalness of locomotion.
Citations: 0
Semi-supervised single-view 3D reconstruction via multi shape prior fusion strategy and self-attention
IF 2.5 | Zone 4, Computer Science | Q2 COMPUTER SCIENCE, SOFTWARE ENGINEERING | Vol. 126, Article 104142 | Pub Date: 2025-02-01 | DOI: 10.1016/j.cag.2024.104142
Wei Zhou , Xinzhe Shi , Yunfeng She , Kunlong Liu , Yongqin Zhang
In the domain of single-view 3D reconstruction, traditional techniques have frequently relied on expensive and time-intensive 3D annotation data. Facing the challenge of annotation acquisition, semi-supervised learning strategies offer an innovative approach to reduce the dependence on labeled data. Despite these developments, the use of this learning paradigm in 3D reconstruction tasks remains relatively limited. In this research, we created a semi-supervised framework for 3D reconstruction that introduces a multi shape prior fusion strategy, intended to guide the creation of more realistic object structures. Additionally, to improve the quality of shape generation, we integrated a self-attention module into the traditional decoder. In benchmark tests on the ShapeNet dataset, our method substantially outperformed existing supervised learning methods at labeled ratios of 1%, 10%, and 20%. Moreover, it showed excellent performance on the real-world Pix3D dataset. Through comprehensive experiments on ShapeNet, our framework demonstrated a 3.3% performance improvement over the baseline. Stringent ablation studies further confirmed the effectiveness of our approach. Our code has been released at https://github.com/NWUzhouwei/SSMP.
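For readers unfamiliar with the self-attention module mentioned above, the following is a minimal sketch of scaled dot-product self-attention over a list of feature vectors. It omits the learned query, key, and value projections (equivalently, it assumes they are identity maps), so it illustrates the mechanism only and is not the paper's decoder; the function names are hypothetical.

```python
import math

def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(tokens):
    """Scaled dot-product self-attention with identity projections.

    tokens -- list of feature vectors (lists of floats) of equal length
    Returns one output vector per input token.
    """
    d = len(tokens[0])
    outputs = []
    for query in tokens:
        # Similarity of this token to every token, scaled by sqrt(d).
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
                  for key in tokens]
        weights = softmax(scores)
        # Each output is a weighted average of all token vectors.
        outputs.append([sum(w * value[j] for w, value in zip(weights, tokens))
                        for j in range(d)])
    return outputs
```

Each output mixes information from all positions, which is why such a module can help a decoder keep distant parts of a generated shape consistent with one another.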
Computers & Graphics-Uk, vol. 126, Article 104142.
Citations: 0
Encoding context and decoding aggregated information for semantic segmentation
IF 2.5 Zone 4 Computer Science Q2 COMPUTER SCIENCE, SOFTWARE ENGINEERING Pub Date: 2025-02-01 DOI: 10.1016/j.cag.2024.104144
Guodong Zhang, Wenzhu Yang, Guoyu Zhou
During feature extraction, existing encoder–decoder network models often use continuous downsampling to expand the receptive field, which directly reduces the model's ability to capture fine-grained spatial information and makes it difficult to recover the lost spatial details when upsampling the feature map. In addition, directly fusing encoder and decoder features causes detailed features to be masked by semantic features. To address these challenges, we build a new semantic segmentation model named ECDAISeg. The model adopts an encoder–decoder structure. We embed a Context Propagation Module (CPM) and a Blend Feature Balance Module (BFBM) between the encoder and the decoder. The role of the CPM is to recover detail information lost after feature extraction and to provide multi-scale contextual information for a better understanding of global semantics. The BFBM balances high-level semantic information with low-level detailed information through an attention mechanism, thereby filtering out redundant information and preserving important details. Evaluations on the PASCAL VOC 2012 and Cityscapes validation sets show that ECDAISeg achieves 82.85% and 74.49% mIoU, respectively, yielding better segmentation results than various representative segmentation models.
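For readers unfamiliar with the mIoU metric reported above, here is a minimal sketch of how mean intersection-over-union is commonly computed from a confusion matrix. It is an illustration under simplifying assumptions, not the evaluation code used in the paper: labels are flat lists of class indices, there is no "ignore" label, and the function names are hypothetical.

```python
def confusion_matrix(pred, target, num_classes):
    """Count pixels by (true class, predicted class)."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(pred, target):
        matrix[t][p] += 1
    return matrix


def mean_iou(pred, target, num_classes):
    """Average per-class IoU = TP / (TP + FP + FN).

    Classes that appear in neither the prediction nor the ground truth
    are skipped so they do not distort the mean.
    """
    matrix = confusion_matrix(pred, target, num_classes)
    ious = []
    for c in range(num_classes):
        tp = matrix[c][c]
        fp = sum(matrix[r][c] for r in range(num_classes)) - tp
        fn = sum(matrix[c]) - tp
        denom = tp + fp + fn
        if denom:
            ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0


# Toy usage: six pixels, three classes.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
score = mean_iou(pred, target, num_classes=3)  # per-class IoU: 1/3, 2/3, 1/2
```

Benchmark protocols typically accumulate the confusion matrix over the whole validation set before taking the per-class ratios, rather than averaging per-image scores.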
Computers & Graphics-Uk, vol. 126, Article 104144.
Citations: 0