Editorial issue 34.6
Authors: Nadia Magnenat Thalmann, Daniel Thalmann
Journal: Computer Animation and Virtual Worlds
DOI: 10.1002/cav.2227 (https://doi.org/10.1002/cav.2227)
Published: 2023-12-28
This issue contains 12 regular papers. In the first paper, Hong Li et al. present FAEC-GAN, an animation translation method based on edge enhancement and coordinate attention. They design a novel edge discrimination network that identifies the edge features of images, so that the generated anime images show clear and coherent lines. A coordinate attention module is introduced in the encoder to adapt the model to the geometric changes in translation and to produce more realistic animation images. In addition, the method combines focal frequency loss with pixel loss, attending to both the frequency-domain and the pixel information of the generated image to improve its visual quality.
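To make the idea of combining a frequency-domain loss with a pixel loss concrete, here is a minimal sketch in plain Python. It is not the FAEC-GAN implementation: it works on 1D signals instead of images, uses a naive DFT, and assumes the two terms are combined as a weighted sum with hypothetical weights `lambda_freq` and `lambda_pix`. The frequency term follows the general focal-frequency idea of weighting each frequency by its own normalized error, so that poorly reconstructed frequencies dominate.

```python
import cmath


def dft(signal):
    """Naive discrete Fourier transform of a 1D real signal."""
    n = len(signal)
    return [
        sum(signal[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def pixel_loss(real, fake):
    """Mean absolute error in the pixel (spatial) domain."""
    return sum(abs(r - f) for r, f in zip(real, fake)) / len(real)


def frequency_loss(real, fake, alpha=1.0):
    """Squared spectral error, weighted per frequency by its normalized error."""
    diffs = [abs(r - f) for r, f in zip(dft(real), dft(fake))]
    peak = max(diffs)
    if peak == 0:
        return 0.0  # identical spectra
    weights = [(d / peak) ** alpha for d in diffs]
    return sum(w * d * d for w, d in zip(weights, diffs)) / len(diffs)


def combined_loss(real, fake, lambda_freq=1.0, lambda_pix=1.0):
    """Weighted sum of both terms (the weighting scheme is an assumption)."""
    return lambda_freq * frequency_loss(real, fake) + lambda_pix * pixel_loss(real, fake)


# Hypothetical toy signals standing in for a target and a generated image row.
target = [0.0, 1.0, 0.0, 1.0]
generated = [0.0, 0.0, 0.0, 0.0]
loss_value = combined_loss(target, generated)
```

In a real training setup the same structure would be written with a tensor library and a 2D FFT; the sketch only shows how the two terms complement each other.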
In the second paper, Rahul Jain et al. propose an algorithm that converts a depth video into a single dynamic image known as a linked motion image (LMI). The LMI is fed to a classifier consisting of an ensemble of three modified pre-trained convolutional neural networks (CNNs). The experiments use two datasets: the multimodal, large-scale EgoGesture dataset and the MSR Gesture 3D dataset. On the EgoGesture dataset, the proposed method achieves an accuracy of 92.91%, which is better than the state-of-the-art methods. On the MSR Gesture 3D dataset, its accuracy is 100%, which outperforms the state-of-the-art methods. The recognition accuracy and precision of each gesture are also highlighted in this work.
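As an illustration of how an ensemble of three classifiers can produce a single prediction, the following sketch averages the class probabilities of three models. The editorial does not state which fusion rule the authors use, so probability averaging is an assumption here, and the logits and the function name `ensemble_predict` are hypothetical.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def ensemble_predict(per_model_logits):
    """Average class probabilities across models.

    Returns (predicted class index, averaged probabilities).
    """
    probs = [softmax(logits) for logits in per_model_logits]
    n_models = len(probs)
    n_classes = len(probs[0])
    avg = [sum(p[c] for p in probs) / n_models for c in range(n_classes)]
    best = max(range(n_classes), key=lambda c: avg[c])
    return best, avg


# Hypothetical logits from three CNNs for one LMI over three gesture classes.
logits = [
    [2.0, 0.5, 0.1],
    [1.2, 1.4, 0.3],
    [1.8, 0.2, 0.4],
]
label, avg_probs = ensemble_predict(logits)
```

Here the second model on its own would pick class 1, but the averaged probabilities favor class 0, which is the kind of disagreement an ensemble is meant to smooth out.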
In the third paper, Rustam Akhunov et al. propose a set of experiments to aid the evaluation of the main categories of fluid-boundary interactions that are important in computer animation: a resting fluid (no motion), tangential and normal motion of a fluid with respect to the boundary, and a fluid impacting a corner. They propose 10 experiments, each comprising an experimental setup and a quantitative evaluation with optional visual inspection, arranged in four groups that each focus on one of the main categories of fluid-boundary interaction. The authors use these experiments to evaluate three particle-based boundary handling methods, namely Pressure Mirroring (PM), Pressure Boundaries (PB) and Moving Least Squares Pressure Extrapolation (MLS), in combination with two incompressible SPH fluid simulation methods, IISPH and DFSPH.
In the fourth paper, Shenghuan Zhao et al. present three Extended Reality (XR) apps (AR, MR, and VR) to interactively visualize façade fenestration geometries and indoor illuminance simulations. The XR technologies are then assessed by 120 students and young architects on two aspects: task performance and engagement level. Task performance is measured by two indicators, correct rate and time consumption, while engagement level is measured by two others, usability and interest. Evaluation results show that, compared to AR and VR, MR is the best XR technology for this aim. VR outperforms AR on three of the four indicators, the exception being usability. By comparing how three different XR technologies perform in aiding fenestration design, this study increases the practical value of applying XR to the building design field.
The fifth paper, by Jing Zhao et al., focuses on a multiple-fluid coupling simulation algorithm based on MPM and PFM. First, based on the MPM, they model multiphase flow on Eulerian grids and, combined with the PFM, capture the sharp interfaces between immiscible fluids. The gas phase is further treated as a fluid during the gas–liquid interaction. Second, to capture the natural evolution of fluid motion from a high-energy state to a low-energy state, the paper proposes a local minimized bulk energy function to control the low-energy state. Finally, the paper designs and carries out several groups of multiple-fluid coupling comparison experiments. Experimental results show that the proposed approach can simulate various effects of rapid diffusion in multiple-fluid coupling, such as complete dissolution, mutual solubility, extraction, and other phenomena.
In the sixth paper, Jiwei Zhang et al. propose a novel method that fuses multiple heterogeneous features through a multi-feature subspace representation network (MFSRN). The goal is to maximize classification performance under common-subspace constraints, that is, while keeping the disparity among features as small as possible. The authors conducted comparative experiments with state-of-the-art models on the bird's-eye view person dataset, and extensive experimental results demonstrated that the proposed MFSRN could achieve better recognition performance. Furthermore, the validity and stability of the method are confirmed.
In the seventh paper, Sahadeb Shit et al. propose a convolutional neural network (CNN)-based image dehazing and detection approach, called End to End Dehaze and Detection Network (EDD-N), for proper image visualization and detection. The network is trained on real-time hazy images, which are used directly to recover dehazed images without a transmission map. EDD-N is reported to be robust and more accurate than the other models considered. The authors also conducted extensive experiments using real-time foggy images. The quantitative and qualitative evaluations on the hazy dataset verify the proposed method's superiority over other dehazing methods. Moreover, the proposed method is validated on real-time object detection tasks in adverse weather conditions, with benefits for intelligent transportation systems.
In the eighth paper, Chaehan So et al. designed a virtual being that combines a deep learning-generated face with a conversational AI model, acting as a virtual conversation partner in online conferencing software, and evaluated it on 11 perceptions of social attributes. Compared to their prior expectations, participants perceived the virtual being as distinctly higher in warmth (engaging, empathic, and approachable) but lower in realism and credibility after 5 days of 10-min daily conversations (Study 1). Further, the authors explored simplifying the technical setup to lower the technical entry barrier for such experiments (Study 2). To this end, they conducted several trials of fine-tuning a small conversational model of 90 million parameters until its performance metrics improved. Testing this fine-tuned model with users revealed that it was not perceived differently from a large conversational model.
In the ninth paper, Di Qi et al. propose a novel split and join approach to simulate a side-to-side stapled intestinal anastomosis in virtual reality. They model the intestine using a new hybrid representation: a grid-linked particles model for physics simulation and a surface mesh for rendering. The proposed split and join operations handle the updates of both the grid-linked particles model and the surface mesh during the anastomosis procedure. The simulation results demonstrate the feasibility of the proposed approach in simulating intestine models and the side-to-side anastomosis operation.
The tenth paper, by Lanfeng Zhou et al., presents a novel method that combines graph convolution with point cloud deep learning. In this method, the skinned multi-person linear (SMPL) model is treated as a graph-structured input, and a coarsened graph is obtained by graph convolution. The coarsened graph is then fed into the PointNet network, which outputs the coordinates of the Dazhui acupoint. Unlike existing methods, the proposed method can directly label the results on the adaptive model, thus improving the accuracy on different models. An optimization method based on graph structure is introduced to better fit the predicted acupoints to the surface. In addition, a dataset marked with Dazhui is constructed for training. Experiments show that the positioning accuracy could meet the requirements of needle application under certain circumstances.
In the eleventh paper, Jian Lu et al. aim to reduce the influence of interfering factors in skeleton-based action recognition by using the joint coordinates of the 2D skeleton to represent changes in human posture. First, the joint coordinates are obtained from RGB video or images using a detector. Then a feature extraction network performs multi-level feature learning to establish a correspondence between actions and their multi-level features. Finally, a hierarchical attention mechanism is introduced to design the CHAN model: by calculating the association between elements, the weights used for action classification are redistributed. The proposed method performs well on the UT-Kinect, KTH and NTU RGB+D datasets.
In the last paper, Numan Ali et al. first conduct a subjective study with field experts to investigate the practical implementation of their existing virtual chemistry laboratory (VCL). Following the experts' suggestions, they propose a virtual reality chemistry laboratory based on task-specific aids (TSA-VRCL) to minimize students' cognitive load and enhance their performance. The task-specific aids consist of arrow, animation and audio aids that are rendered separately with each step of the experimental tasks. During the evaluation, 80 students performed the experiments in four groups under four different experimental conditions. The evaluation revealed that the proposed TSA-VRCL minimizes students' cognitive load and enhances their performance.
About the journal:
With the advent of very powerful PCs and high-end graphics cards, there has been an incredible development in Virtual Worlds, real-time computer animation and simulation, and games. At the same time, new and cheaper Virtual Reality devices have appeared, allowing interaction with these real-time Virtual Worlds and even with real worlds through Augmented Reality. Three-dimensional characters, especially Virtual Humans, are now of exceptional quality, which allows them to be used in the movie industry. But this is only a beginning: with the development of Artificial Intelligence and Agent technology, these characters will become more and more autonomous and even intelligent. They will inhabit the Virtual Worlds in a Virtual Life together with animals and plants.