Implementation of natural hand gestures in holograms for 3D object manipulation
Pub Date: 2023-10-01 | DOI: 10.1016/j.vrih.2023.02.001
Ajune Wanis Ismail, Muhammad Akma Iman
Holograms provide a distinctive way to display and convey information, and they have been improved to support better user interaction. Holographic interaction is important because it improves how users engage with virtual objects. Gesture interaction is a recent research topic, as it allows users to interact directly with holograms using their bare hands. However, it remains unclear whether real hand gestures are well suited to hologram applications. Therefore, we discuss the development process and implementation of three-dimensional (3D) object manipulation using natural hand gestures in a hologram. As initial findings, we describe the design and development process for the hologram application and its integration with real hand gesture interaction. Experimental results collected with the NASA TLX questionnaire are discussed. Based on these findings, we realize user interaction in the hologram.
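Since the workload results come from the NASA TLX questionnaire, a minimal sketch of how raw and weighted TLX scores are conventionally computed may help; the six subscales follow the standard instrument, while the function and variable names here are our own illustration, not the authors' analysis code.

```python
# Minimal sketch of NASA TLX scoring; the six subscales are standard,
# the function and variable names are illustrative.
SUBSCALES = ["mental", "physical", "temporal", "performance", "effort", "frustration"]

def raw_tlx(ratings: dict) -> float:
    """Raw TLX: unweighted mean of the six 0-100 subscale ratings."""
    return sum(ratings[s] for s in SUBSCALES) / len(SUBSCALES)

def weighted_tlx(ratings: dict, tally: dict) -> float:
    """Weighted TLX: each subscale's weight is the number of times it was
    chosen in the 15 pairwise comparisons, so the weights sum to 15."""
    assert sum(tally.values()) == 15
    return sum(ratings[s] * tally[s] for s in SUBSCALES) / 15

# One hypothetical participant: subscale ratings and comparison tally.
ratings = {"mental": 55, "physical": 20, "temporal": 40,
           "performance": 30, "effort": 50, "frustration": 25}
tally = {"mental": 4, "physical": 1, "temporal": 3,
         "performance": 3, "effort": 3, "frustration": 1}
print(raw_tlx(ratings), weighted_tlx(ratings, tally))
```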
{"title":"Implementation of natural hand gestures in holograms for 3D object manipulation","authors":"Ajune Wanis Ismail , Muhammad Akma Iman","doi":"10.1016/j.vrih.2023.02.001","DOIUrl":"https://doi.org/10.1016/j.vrih.2023.02.001","url":null,"abstract":"<div><p>Holograms provide a characteristic manner to display and convey information, and have been improved to provide better user interactions Holographic interactions are important as they improve user interactions with virtual objects. Gesture interaction is a recent research topic, as it allows users to use their bare hands to directly interact with the hologram. However, it remains unclear whether real hand gestures are well suited for hologram applications. Therefore, we discuss the development process and implementation of three-dimensional object manipulation using natural hand gestures in a hologram. We describe the design and development process for hologram applications and its integration with real hand gesture interactions as initial findings. Experimental results from Nasa TLX form are discussed. Based on the findings, we actualize the user interactions in the hologram.</p></div>","PeriodicalId":33538,"journal":{"name":"Virtual Reality Intelligent Hardware","volume":"5 5","pages":"Pages 439-450"},"PeriodicalIF":0.0,"publicationDate":"2023-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"71728995","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Survey of lightweighting methods of huge 3D models for online Web3D visualization
Pub Date: 2023-10-01 | DOI: 10.1016/j.vrih.2020.02.002
Xiaojun Liu, Jinyuan Jia, Chang Liu
Background
With the rapid development of Web3D technologies, online Web3D visualization, particularly of complex models or scenes, is in great demand. Because processing such huge models creates a major conflict between Web3D system load and resource consumption, this paper reviews lightweighting methods for huge 3D models in online Web3D visualization.
Methods
Starting from the observation that manual operations during modeling introduce geometric redundancy, several categories of lightweighting-related work aimed at reducing data volume and resource consumption for Web3D visualization are elaborated.
Results
By comparing these perspectives, the characteristics of each method are summarized. Among the reviewed methods, geometric redundancy removal, which achieves the lightweighting goal by detecting and removing repeated components, is an appropriate choice for current online Web3D visualization. Meanwhile, learning-based algorithms, though still maturing, are our expected future research topic.
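To make the geometric-redundancy-removal idea concrete, here is a minimal sketch (our own illustration, not any surveyed system's code) that detects repeated mesh components by hashing a translation-invariant geometric signature, keeping one representative mesh per group plus a per-instance transform:

```python
import hashlib
import numpy as np

def component_signature(vertices: np.ndarray, decimals: int = 4) -> str:
    """Translation-invariant signature of one mesh component: center it,
    round coordinates, sort rows deterministically, and hash the bytes.
    (A full solution would also canonicalize rotation; omitted here.)"""
    centered = vertices - vertices.mean(axis=0)
    rounded = np.round(centered, decimals)
    order = np.lexsort(rounded.T)            # deterministic row order
    return hashlib.sha1(rounded[order].tobytes()).hexdigest()

def deduplicate(components: list) -> tuple:
    """Group repeated components: store one canonical mesh per signature
    and, per instance, the signature plus the translation back to pose."""
    reps, instances = {}, []
    for verts in components:
        sig = component_signature(verts)
        if sig not in reps:
            reps[sig] = verts - verts.mean(axis=0)   # canonical copy
        instances.append((sig, verts.mean(axis=0)))  # (which mesh, where)
    return reps, instances
```

Only the representative meshes then need to be transmitted and stored; every repeated instance reduces to a reference and a transform, which is what makes this strategy effective for huge man-made models.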
Conclusions
Various aspects should be considered in an efficient lightweighting method for online Web3D visualization, such as the characteristics of the original data, combinations or extensions of existing methods, scheduling strategy, cache management, and the rendering mechanism. Meanwhile, innovative methods, particularly learning-based algorithms, are worth exploring.
{"title":"Survey of lightweighting methods of huge 3D models for online Web3D visualization","authors":"Xiaojun Liu , Jinyuan Jia , Chang Liu","doi":"10.1016/j.vrih.2020.02.002","DOIUrl":"https://doi.org/10.1016/j.vrih.2020.02.002","url":null,"abstract":"<div><h3>Background</h3><p>With the rapid development of Web3D technologies, the online Web3D visualization, particularly for complex models or scenes, has been in a great demand. Owing to the major conflict between the Web3D system load and resource consumption in the processing of these huge models, the huge 3D model lightweighting methods for online Web3D visualization are reviewed in this paper.</p></div><div><h3>Methods</h3><p>By observing the geometry redundancy introduced by man-made operations in the modeling procedure, several categories of lightweighting related work that aim at reducing the amount of data and resource consumption are elaborated for Web3D visualization.</p></div><div><h3>Results</h3><p>By comparing perspectives, the characteristics of each method are summarized, and among the reviewed methods, the geometric redundancy removal that achieves the lightweight goal by detecting and removing the repeated components is an appropriate method for current online Web3D visualization. Meanwhile, the learning algorithm, still in improvement period at present, is our expected future research topic.</p></div><div><h3>Conclusions</h3><p>Various aspects should be considered in an efficient lightweight method for online Web3D visualization, such as characteristics of original data, combination or extension of existing methods, scheduling strategy, cache management, and rendering mechanism. Meanwhile, innovation methods, particularly the learning algorithm, are worth exploring.</p></div>","PeriodicalId":33538,"journal":{"name":"Virtual Reality Intelligent Hardware","volume":"5 5","pages":"Pages 395-406"},"PeriodicalIF":0.0,"publicationDate":"2023-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"71728992","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Deep-reinforcement-learning-based robot motion strategies for grabbing objects from human hands
Pub Date: 2023-10-01 | DOI: 10.1016/j.vrih.2022.12.001
Zeyuan Cai, Zhiquan Feng, Liran Zhou, Xiaohui Yang, Tao Xu
Background
Robot grasping encompasses a wide range of research areas; however, most studies have focused on grasping stationary objects in a scene, and only a few have examined how to grasp objects from a user's hand. In this paper, a robot grasping algorithm based on deep reinforcement learning (RGRL) is proposed.
Methods
The RGRL takes the relative positions of the robot and the object in the user's hand as input and outputs the best robot action for the current state. The proposed algorithm thus realizes autonomous path planning and safe grasping of objects from users' hands, exploring a new method for improving the safety of human–robot cooperation. To address the low sample-utilization rate and slow convergence of reinforcement-learning algorithms, the RGRL is first trained in a simulated scene, and the model parameters are then transferred to a real scene. To reduce the gap between the simulated and real scenes, domain randomization is applied: the positions and angles of objects in the simulated scenes are randomly changed at regular intervals, improving the diversity of the training samples and the robustness of the algorithm.
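As a concrete illustration of the domain-randomization step, here is a minimal sketch under our own assumptions (the tiny simulator stub, jitter ranges, and interval are illustrative, not the authors' setup):

```python
import random

class SimScene:
    """Tiny stand-in for a physics simulator's pose interface (illustrative)."""
    def __init__(self, poses):                 # poses: {name: [x, y, z, yaw_deg]}
        self.poses = poses
    def get_pose(self, obj): return self.poses[obj]
    def set_pose(self, obj, x, y, z, yaw): self.poses[obj] = [x, y, z, yaw]

def randomize_scene(sim, objects, pos_range=0.10, angle_range=15.0):
    """Domain randomization: jitter each object's position (metres) and
    yaw (degrees) so successive training episodes see varied scenes."""
    for obj in objects:
        x, y, z, yaw = sim.get_pose(obj)
        sim.set_pose(obj,
                     x + random.uniform(-pos_range, pos_range),
                     y + random.uniform(-pos_range, pos_range),
                     z,  # keep height; the support surface is fixed
                     (yaw + random.uniform(-angle_range, angle_range)) % 360)

sim = SimScene({"cup": [0.5, 0.2, 0.8, 0.0]})
for episode in range(100):
    if episode % 20 == 0:                      # re-randomize at regular intervals
        randomize_scene(sim, ["cup"])
    # ... run one training episode in `sim` ...
```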
Results
The effectiveness and accuracy of the RGRL are verified in both simulated and real scenes, and the results show that it achieves an accuracy of more than 80% in both cases.
Conclusions
The RGRL is a robot grasping algorithm that employs domain randomization and deep reinforcement learning for effective grasping in both simulated and real scenes. However, it lacks flexibility in adapting to different grasping poses, motivating future research on safe grasping for diverse user postures.
{"title":"Deep-reinforcement-learning-based robot motion strategies for grabbing objects from human hands","authors":"Zeyuan Cai , Zhiquan Feng , Liran Zhou , Xiaohui Yang , Tao Xu","doi":"10.1016/j.vrih.2022.12.001","DOIUrl":"https://doi.org/10.1016/j.vrih.2022.12.001","url":null,"abstract":"<div><h3>Background</h3><p>Robot grasping encompasses a wide range of research areas; however, most studies have been focused on the grasping of only stationary objects in a scene; only a few studies on how to grasp objects from a user's hand have been conducted. In this paper, a robot grasping algorithm based on deep reinforcement learning (RGRL) is proposed.</p></div><div><h3>Methods</h3><p>The RGRL takes the relative positions of the robot and the object in a user's hand as input and outputs the best action of the robot in the current state. Thus, the proposed algorithm realizes the functions of autonomous path planning and grasping objects safely from the hands of users. A new method for improving the safety of human–robot cooperation is explored. To solve the problems of a low utilization rate and slow convergence of reinforcement learning algorithms, the RGRL is first trained in a simulation scene, and then, the model parameters are applied to a real scene. To reduce the difference between the simulated and real scenes, domain randomization is applied to randomly change the positions and angles of objects in the simulated scenes at regular intervals, thereby improving the diversity of the training samples and robustness of the algorithm.</p></div><div><h3>Results</h3><p>The RGRL's effectiveness and accuracy are verified by evaluating it on both simulated and real scenes, and the results show that the RGRL can achieve an accuracy of more than 80% in both cases.</p></div><div><h3>Conclusions</h3><p>RGRL is a robot grasping algorithm that employs domain randomization and deep reinforcement learning for effective grasping in simulated and real scenes. However, it lacks flexibility in adapting to different grasping poses, prompting future research in achieving safe grasping for diverse user postures.</p></div>","PeriodicalId":33538,"journal":{"name":"Virtual Reality Intelligent Hardware","volume":"5 5","pages":"Pages 407-421"},"PeriodicalIF":0.0,"publicationDate":"2023-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"71728993","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Eye-shaped keyboard for dual-hand text entry in virtual reality
Pub Date: 2023-10-01 | DOI: 10.1016/j.vrih.2023.07.001
Kangyu Wang, Yangqiu Yan, Hao Zhang, Xiaolong Liu, Lili Wang
We propose an eye-shaped keyboard for high-speed text entry in virtual reality (VR). The keyboard has the shape of two eyes, with characters arranged along the curved eyelids, which keeps key density low and key spacing short. It follows the QWERTY key sequence, allowing users to benefit from their experience with QWERTY keyboards. The user interacts with the eye-shaped keyboard using rays controlled with both hands; a character is entered in one step by moving a ray from the inner eye region to the character's region. A high-speed auto-complete system was also designed for the keyboard. We conducted a pilot study to determine the optimal parameters and a user study to compare the eye-shaped keyboard with QWERTY and circular keyboards. For beginners, the eye-shaped keyboard performed significantly more efficiently and accurately, with less task load and hand movement, than the circular keyboard. Compared with the QWERTY keyboard, it is more accurate and significantly reduces hand translation while maintaining similar efficiency. Finally, to evaluate the potential of the eye-shaped keyboard, we conducted another user study in which participants typed continuously for three days, with two sessions per day. In each session, participants typed for 20 min, after which their typing performance was tested. The eye-shaped keyboard proved efficient and promising, with an average speed of 19.89 words per minute (WPM) and a mean uncorrected error rate of 1.939%. The maximum speed reached 24.97 WPM after six sessions and was still increasing.
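The speed and error figures above follow standard text-entry metrics; a minimal sketch of how they are conventionally computed (one common variant of the definitions, not the authors' evaluation code) is:

```python
def words_per_minute(transcribed: str, seconds: float) -> float:
    """Text-entry WPM: one 'word' is five characters, including spaces
    (a common variant; some analyses use len(T) - 1 in the numerator)."""
    return (len(transcribed) / 5) / (seconds / 60)

def uncorrected_error_rate(correct: int, incorrect_not_fixed: int,
                           incorrect_fixed: int) -> float:
    """Uncorrected error rate (Soukoreff & MacKenzie): errors left in the
    final text over the total character classifications C + INF + IF."""
    total = correct + incorrect_not_fixed + incorrect_fixed
    return 100 * incorrect_not_fixed / total

# Example: a 120-character phrase typed in 60 s, with 2 uncorrected and
# 3 corrected errors among 115 correct characters.
print(words_per_minute("x" * 120, 60.0))    # 24.0 WPM
print(uncorrected_error_rate(115, 2, 3))    # ~1.67 %
```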
{"title":"Eye-shaped keyboard for dual-hand text entry in virtual reality","authors":"Kangyu Wang , Yangqiu Yan , Hao Zhang , Xiaolong Liu , Lili Wang","doi":"10.1016/j.vrih.2023.07.001","DOIUrl":"https://doi.org/10.1016/j.vrih.2023.07.001","url":null,"abstract":"<div><p>We propose an eye-shaped keyboard for high-speed text entry in virtual reality (VR), having the shape of dual eyes with characters arranged along the curved eyelids, which ensures low density and short spacing of the keys. The eye-shaped keyboard references the QWERTY key sequence, allowing the users to benefit from their experience using the QWERTY keyboard. The user interacts with an eye-shaped keyboard using rays controlled with both the hands. A character can be entered in one step by moving the rays from the inner eye regions to regions of the characters. A high-speed auto-complete system was designed for the eye-shaped keyboard. We conducted a pilot study to determine the optimal parameters, and a user study to compare our eye-shaped keyboard with the QWERTY and circular keyboards. For beginners, the eye-shaped keyboard performed significantly more efficiently and accurately with less task load and hand movement than the circular keyboard. Compared with the QWERTY keyboard, the eye-shaped keyboard is more accurate and significantly reduces hand translation while maintaining similar efficiency. Finally, to evaluate the potential of eye-shaped keyboards, we conducted another user study. In this study, the participants were asked to type continuously for three days using the proposed eye-shaped keyboard, with two sessions per day. In each session, participants were asked to type for 20min, and then their typing performance was tested. The eye-shaped keyboard was proven to be efficient and promising, with an average speed of 19.89 words per minute (WPM) and mean uncorrected error rate of 1.939%. The maximum speed reached 24.97 WPM after six sessions and continued to increase.</p></div>","PeriodicalId":33538,"journal":{"name":"Virtual Reality Intelligent Hardware","volume":"5 5","pages":"Pages 451-469"},"PeriodicalIF":0.0,"publicationDate":"2023-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"71729298","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Novel learning framework for optimal multi-object video trajectory tracking
Pub Date: 2023-10-01 | DOI: 10.1016/j.vrih.2023.04.001
Siyuan Chen, Xiaowu Hu, Wenying Jiang, Wen Zhou, Xintao Ding
Background
With the rapid development of Web3D, virtual reality, and digital twins, virtual trajectories and decision data rely considerably on the analysis and understanding of real video data, particularly in emergency evacuation scenarios. Evacuating crowds correctly and effectively in virtual emergency scenarios is becoming increasingly urgent. One good solution is to extract pedestrian trajectories from videos of emergency situations using a multi-target tracking algorithm and to use them to define evacuation procedures.
Methods
To implement this solution, a trajectory extraction and optimization framework based on multi-target tracking is developed in this study. First, a multi-target tracking algorithm is used to extract and preprocess the crowd's trajectory data from a video. Then, the trajectories are optimized by combining a trajectory-point extraction algorithm with Savitzky–Golay smoothing. Finally, related experiments show that the proposed approach can effectively and accurately extract the trajectories of multiple target objects in real time.
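For the smoothing step, here is a minimal sketch using SciPy's Savitzky–Golay filter; the window length and polynomial order are illustrative choices, not the paper's tuned values:

```python
import numpy as np
from scipy.signal import savgol_filter

def smooth_trajectory(points: np.ndarray, window: int = 11, order: int = 3):
    """Smooth an (N, 2) trajectory with a Savitzky-Golay filter, which fits
    a low-order polynomial in each sliding window; applied along the time
    axis (axis=0), it preserves shape better than a plain moving average."""
    if len(points) < window:                 # keep the window odd and valid
        window = len(points) if len(points) % 2 else len(points) - 1
    order = min(order, window - 1)           # polyorder must be < window
    return savgol_filter(points, window_length=window, polyorder=order, axis=0)

# Example: a noisy diagonal walk of 50 tracked pixel positions.
rng = np.random.default_rng(0)
t = np.linspace(0, 1, 50)
noisy = np.stack([100 * t, 60 * t], axis=1) + rng.normal(0, 1.5, (50, 2))
smooth = smooth_trajectory(noisy)
```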
Results
In addition, the proposed approach retains the real characteristics of the trajectories as much as possible while improving the trajectory-smoothness index, which can provide data support for analyzing pedestrian trajectory data and formulating evacuation schemes in emergency scenarios.
Conclusions
Further comparisons with methods used in related studies confirm the feasibility and superiority of the proposed framework.
{"title":"Novel learning framework for optimal multi-object video trajectory tracking","authors":"Siyuan Chen, Xiaowu Hu, Wenying Jiang, Wen Zhou, Xintao Ding","doi":"10.1016/j.vrih.2023.04.001","DOIUrl":"https://doi.org/10.1016/j.vrih.2023.04.001","url":null,"abstract":"<div><h3>Background</h3><p>With the rapid development of Web3D, virtual reality, and digital twins, virtual trajectories and decision data considerably rely on the analysis and understanding of real video data, particularly in emergency evacuation scenarios. Correctly and effectively evacuating crowds in virtual emergency scenarios are becoming increasingly urgent. One good solution is to extract pedestrian trajectories from videos of emergency situations using a multi-target tracking algorithm and use them to define evacuation procedures.</p></div><div><h3>Methods</h3><p>To implement this solution, a trajectory extraction and optimization framework based on multi-target tracking is developed in this study. First, a multi-target tracking algorithm is used to extract and preprocess the trajectory data of the crowd in a video. Then, the trajectory is optimized by combining the trajectory point extraction algorithm and Savitzky–Golay smoothing filtering method. Finally, related experiments are conducted, and the results show that the proposed approach can effectively and accurately extract the trajectories of multiple target objects in real time.</p></div><div><h3>Results</h3><p>In addition, the proposed approach retains the real characteristics of the trajectories as much as possible while improving the trajectory smoothing index, which can provide data support for the analysis of pedestrian trajectory data and formulation of personnel evacuation schemes in emergency scenarios.</p></div><div><h3>Conclusions</h3><p>Further comparisons with methods used in related studies confirm the feasibility and superiority of the proposed framework.</p></div>","PeriodicalId":33538,"journal":{"name":"Virtual Reality Intelligent Hardware","volume":"5 5","pages":"Pages 422-438"},"PeriodicalIF":0.0,"publicationDate":"2023-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"71728994","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A survey of real-time rendering on Web3D application
Pub Date: 2023-10-01 | DOI: 10.1016/j.vrih.2022.04.002
Geng Yu, Chang Liu, Ting Fang, Jinyuan Jia, Enming Lin, Yiqiang He, Siyuan Fu, Long Wang, Lei Wei, Qingyu Huang
Background
In recent years, with the rapid development of the mobile Internet and Web3D technologies, a large number of web-based online 3D visualization applications have emerged. Web3D online tourism, architecture, education, medical care, and shopping are examples of applications that leverage 3D rendering on the web. These applications have pushed the boundaries of traditional web applications, which use text, sound, images, video, and 2D animation as their main communication media, by making 3D virtual scenes the main interaction object, enabling a user experience with a strong sense of immersion. This paper approaches the emerging Web3D applications that increasingly affect people's lives through real-time rendering technology, the core technology of Web3D. It discusses the major 3D graphics APIs of Web3D and well-known domestic and international Web3D engines, and classifies the real-time rendering frameworks of Web3D applications into different categories.
Results
Finally, this study analyzed the specific demands that different fields place on Web3D applications by referring to representative Web3D applications in each field.
Conclusions
Our survey results show that Web3D applications based on real-time rendering have penetrated deep into many sectors of society and even the family, a trend that influences every industry.
{"title":"A survey of real-time rendering on Web3D application","authors":"Geng Yu , Chang Liu , Ting Fang , Jinyuan Jia , Enming Lin , Yiqiang He , Siyuan Fu , Long Wang , Lei Wei , Qingyu Huang","doi":"10.1016/j.vrih.2022.04.002","DOIUrl":"https://doi.org/10.1016/j.vrih.2022.04.002","url":null,"abstract":"<div><h3>Background</h3><p>In recent years, with the rapid development of mobile Internet and Web3D technologies, a large number of web-based online 3D visualization applications have emerged. Web3D applications, including Web3D online tourism, Web3D online architecture, Web3D online education environment, Web3D online medical care, and Web3D online shopping are examples of these applications that leverage 3D rendering on the web. These applications have pushed the boundaries of traditional web applications that use text, sound, image, video, and 2D animation as their main communication media, and resorted to 3D virtual scenes as the main interaction object, enabling a user experience that delivers a strong sense of immersion. This paper approached the emerging Web3D applications that generate stronger impacts on people's lives through “real-time rendering technology”, which is the core technology of Web3D. This paper discusses all the major 3D graphics APIs of Web3D and the well-known Web3D engines at home and abroad and classify the real-time rendering frameworks of Web3D applications into different categories.</p></div><div><h3>Results</h3><p>Finally, this study analyzed the specific demand posed by different fields to Web3D applications by referring to the representative Web3D applications in each particular field.</p></div><div><h3>Conclusions</h3><p>Our survey results show that Web3D applications based on real-time rendering have in-depth sectors of society and even family, which is a trend that has influence on every line of industry.</p></div>","PeriodicalId":33538,"journal":{"name":"Virtual Reality Intelligent Hardware","volume":"5 5","pages":"Pages 379-394"},"PeriodicalIF":0.0,"publicationDate":"2023-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"71728991","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Human-pose estimation based on weak supervision
Pub Date: 2023-08-01 | DOI: 10.1016/j.vrih.2022.08.010
Xiaoyan Hu, Xizhao Bao, Guoli Wei, Zhaoyu Li
Background
In computer vision, simultaneously estimating human pose, shape, and clothing is a practical real-life problem, but it remains challenging owing to the variety of clothing, the complexity of deformation, the shortage of large-scale datasets, and the difficulty of estimating clothing style.
Methods
We propose a multistage weakly supervised method that makes full use of data with less-labeled information to learn to estimate human body shape, pose, and clothing deformation. In the first stage, the SMPL human-body model parameters are regressed using multi-view 2D keypoints of the human body. Using multi-view information as weak supervision avoids the depth ambiguity of a single view, yields a more accurate human posture, and makes the supervisory information easy to obtain. In the second stage, clothing is represented by a PCA-based model that uses 2D clothing keypoints as supervision to regress its parameters. In the third stage, we predefine an embedding graph for each clothing type to describe its deformation, and the clothing's mask information is then used to further adjust the deformation. To facilitate training, we constructed a multi-view synthetic dataset that includes BCNet and SURREAL.
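A minimal sketch of the multi-view 2D-keypoint supervision in the first stage (our own illustration; the pinhole camera model and all names here are assumptions, not the paper's code): project the body model's 3D joints into each calibrated view and penalize the distance to the detected 2D keypoints.

```python
import numpy as np

def project(joints3d: np.ndarray, K: np.ndarray, R: np.ndarray, t: np.ndarray):
    """Pinhole projection of (J, 3) joints with intrinsics K and
    extrinsics [R | t]; returns (J, 2) pixel coordinates."""
    cam = joints3d @ R.T + t            # world -> camera coordinates
    uvw = cam @ K.T                     # camera -> homogeneous pixels
    return uvw[:, :2] / uvw[:, 2:3]

def multiview_keypoint_loss(joints3d, views):
    """Weakly supervised loss: mean squared reprojection error over all
    views. `views` is a list of (K, R, t, keypoints2d, confidence)."""
    loss = 0.0
    for K, R, t, kp2d, conf in views:
        err = project(joints3d, K, R, t) - kp2d
        loss += np.mean(conf[:, None] * err ** 2)   # confidence-weighted
    return loss / len(views)
```

Because each view constrains the same 3D joints from a different direction, minimizing this loss resolves the depth ambiguity that a single view would leave.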
Results
The experiments show that, using only weakly supervised information, the accuracy of our method reaches the level of state-of-the-art (SOTA) methods that use strong supervision. Because weakly supervised information is much easier to obtain, our method has the advantage of using existing data for training. Experiments on the DeepFashion2 dataset show that our method can make full use of the available weak supervision to fine-tune on a dataset with little supervision information, whereas strongly supervised methods cannot be trained or adjusted there owing to the lack of exact annotations.
Conclusions
Our weakly supervised method can accurately estimate human body shape, pose, and several common types of clothing, and it helps overcome the current shortage of clothing data.
{"title":"Human-pose estimation based on weak supervision","authors":"Xiaoyan Hu, Xizhao Bao, Guoli Wei, Zhaoyu Li","doi":"10.1016/j.vrih.2022.08.010","DOIUrl":"https://doi.org/10.1016/j.vrih.2022.08.010","url":null,"abstract":"<div><h3>Background</h3><p>In computer vision, simultaneously estimating human pose, shape, and clothing is a practical issue in real life, but remains a challenging task owing to the variety of clothing, complexity of deformation, shortage of large-scale datasets, and difficulty in estimating clothing style.</p></div><div><h3>Methods</h3><p>We propose a multistage weakly supervised method that makes full use of data with less labeled information for learning to estimate human body shape, pose, and clothing deformation. In the first stage, the SMPL human-body model parameters were regressed using the multi-view 2D key points of the human body. Using multi-view information as weakly supervised information can avoid the deep ambiguity problem of a single view, obtain a more accurate human posture, and access supervisory information easily. In the second stage, clothing is represented by a PCAbased model that uses two-dimensional key points of clothing as supervised information to regress the parameters. In the third stage, we predefine an embedding graph for each type of clothing to describe the deformation. Then, the mask information of the clothing is used to further adjust the deformation of the clothing. To facilitate training, we constructed a multi-view synthetic dataset that included BCNet and SURREAL.</p></div><div><h3>Results</h3><p>The Experiments show that the accuracy of our method reaches the same level as that of SOTA methods using strong supervision information while only using weakly supervised information. Because this study uses only weakly supervised information, which is much easier to obtain, it has the advantage of utilizing existing data as training data. Experiments on the DeepFashion2 dataset show that our method can make full use of the existing weak supervision information for fine-tuning on a dataset with little supervision information, compared with the strong supervision information that cannot be trained or adjusted owing to the lack of exact annotation information.</p></div><div><h3>Conclusions</h3><p>Our weak supervision method can accurately estimate human body size, pose, and several common types of clothing and overcome the issues of the current shortage of clothing data.</p></div>","PeriodicalId":33538,"journal":{"name":"Virtual Reality Intelligent Hardware","volume":"5 4","pages":"Pages 366-377"},"PeriodicalIF":0.0,"publicationDate":"2023-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49848597","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The validity analysis of the non-local mean filter and a derived novel denoising method
Pub Date: 2023-08-01 | DOI: 10.1016/j.vrih.2022.08.017
Xiangyuan Liu, Zhongke Wu, Xingce Wang
Image denoising is an important topic in digital image processing. This paper theoretically studies the validity of the classical non-local means (NLM) filter for removing Gaussian noise from a novel statistical perspective. Regarding the restored image as an estimator of the clean image, we analyse step by step the unbiasedness and effectiveness of the restored values obtained by the NLM filter. We then propose an improved NLM algorithm, the clustering-based NLM filter (CNLM), derived from the conditions obtained through the theoretical analysis. The proposed filter attempts to restore an ideal value using the approximately constant intensities obtained by an image clustering process; here, we adopt a mixed probability model on a prefiltered image to generate an estimator of the ideal clustered components. The experimental results show that our algorithm achieves considerable improvement in peak signal-to-noise ratio (PSNR) and visual quality when removing Gaussian noise. Moreover, the filter's considerable practical performance shows that our method is theoretically acceptable, as it can effectively estimate ideal images.
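For reference, the classical NLM estimator analysed here has the standard form introduced by Buades et al., where v is the noisy image, N_i is a patch centered at pixel i, h is the smoothing parameter, and Z(i) normalizes the weights:

```latex
% Classical non-local means: each restored pixel is a weighted average of
% all pixels, with weights given by patch similarity.
\hat{u}(i) = \sum_{j} w(i,j)\, v(j),
\qquad
w(i,j) = \frac{1}{Z(i)} \exp\!\left( -\frac{\lVert v(N_i) - v(N_j) \rVert_{2,a}^{2}}{h^{2}} \right),
\qquad
Z(i) = \sum_{j} \exp\!\left( -\frac{\lVert v(N_i) - v(N_j) \rVert_{2,a}^{2}}{h^{2}} \right)
```

Here the patch distance is the Gaussian-weighted L2 norm with kernel standard deviation a; the paper's analysis concerns when this weighted average is an unbiased, effective estimator of the clean pixel value.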
{"title":"The validity analysis of the non-local mean filter and a derived novel denoising method","authors":"Xiangyuan Liu, Zhongke Wu, Xingce Wang","doi":"10.1016/j.vrih.2022.08.017","DOIUrl":"https://doi.org/10.1016/j.vrih.2022.08.017","url":null,"abstract":"<div><p>Image denoising is an important topic in the digital image processing field. This paper theoretically studies the validity of the classical non-local mean filter (NLM) for removing Gaussian noise from a novel statistic perspective. By regarding the restored image as an estimator of the clear image from the statistical view, we gradually analyse the unbiasedness and effectiveness of the restored value obtained by the NLM filter. Then, we propose an improved NLM algorithm called the clustering-based NLM filter (CNLM) that derived from the conditions obtained through the theoretical analysis. The proposed filter attempts to restore an ideal value using the approximately constant intensities obtained by the image clustering process. Here, we adopt a mixed probability model on a prefiltered image to generate an estimator of the ideal clustered components. The experimental results show that our algorithm obtains considerable improvement in peak signal-to-noise ratio (PSNR) values and visual results when removing Gaussian noise. On the other hand, the considerable practical performance of our filter shows that our method is theoretically acceptable as it can effectively estimates ideal images.</p></div>","PeriodicalId":33538,"journal":{"name":"Virtual Reality Intelligent Hardware","volume":"5 4","pages":"Pages 338-350"},"PeriodicalIF":0.0,"publicationDate":"2023-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49897113","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
An intelligent experimental container suite: using a chemical experiment with virtual-real fusion as an example
Pub Date: 2023-08-01 | DOI: 10.1016/j.vrih.2022.07.008
Lurong Yang, Zhiquan Feng, Junhong Meng
Background
At present, the teaching of experiments in primary and secondary schools is constrained by cost and safety factors. Existing research on virtual-experiment platforms alleviates this problem; however, the lack of real experimental equipment and the reliance on a single channel for understanding users' intentions weaken these platforms operationally and degrade the naturalness of interaction. To solve these problems, we propose an intelligent experimental container structure and a situational awareness algorithm, both of which are verified and then applied to a chemical experiment involving virtual-real fusion. First, acquired images are denoised in the visual channel, using maximum diffuse-reflection chroma to remove overexposure. Second, container situational awareness is realized by segmenting the liquid level in the image and establishing a relation-fitting model. Then, strategies for constructing complete behaviors and for prioritizing among behaviors are adopted for information complementarity and information independence, respectively. A multichannel intention-understanding model and an interactive paradigm fusing vision, hearing, and touch are proposed. The results show that the designed experimental container and algorithm, used in a virtual chemical-experiment platform, achieve natural human-computer interaction, enhance the user's sense of operation, and achieve high user satisfaction.
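As a concrete illustration of the liquid-level step, here is a sketch under our own assumptions (the threshold, the darker-liquid assumption, and the linear level-to-volume relation are hypothetical, not the paper's calibration): segment the liquid region in a container's image region and fit a mapping from pixel level to volume.

```python
import numpy as np

def liquid_level_px(roi_gray: np.ndarray, thresh: int = 90) -> float:
    """Estimate the liquid level (pixels above the container bottom) by
    thresholding the darker liquid region and taking its topmost row."""
    mask = roi_gray < thresh                  # liquid assumed darker than glass
    rows = np.where(mask.any(axis=1))[0]
    return float(roi_gray.shape[0] - rows.min()) if len(rows) else 0.0

def fit_level_to_volume(levels_px, volumes_ml):
    """Relation fitting: least-squares line mapping pixel level -> volume,
    calibrated once per container from a few known fillings."""
    a, b = np.polyfit(levels_px, volumes_ml, deg=1)
    return lambda level: a * level + b

# Hypothetical calibration: three known fillings of one container.
to_ml = fit_level_to_volume([40, 80, 120], [10.0, 20.0, 30.0])
print(to_ml(100))   # ~25 ml
```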
{"title":"An intelligent experimental container suite: using a chemical experiment with virtual-real fusion as an example","authors":"Lurong Yang , Zhiquan Feng , Junhong Meng","doi":"10.1016/j.vrih.2022.07.008","DOIUrl":"https://doi.org/10.1016/j.vrih.2022.07.008","url":null,"abstract":"<div><h3>Background</h3><p>At present, the teaching of experiments in primary and secondary schools is affected by cost and security factors. The existing research on virtual-experiment platforms alleviates this problem. However, the lack of real experimental equipment and the use of a single channel to understand users’ intentions weaken these platforms operationally and degrade the naturalness of interactions. To slove the above problems,we propose an intelligent experimental container structure and a situational awareness algorithm,both of which are verified and then applied to a chemical experiment involving virtual-real fusion. First, acquired images are denoised in the visual channel, using maximum diffuse reflection chroma to remove overexposures. Second, container situational awareness is realized by segmenting the image liquid level and establishing a relation-fitting model. Then, strategies for constructing complete behaviors and making priority comparisons among behaviors are adopted for information complementarity and information independence, respectively. A multichannel intentional understanding model and an interactive paradigm fusing vision, hearing and touch are proposed. The results show that the designed experimental container and algorithm in a virtual chemical experiment platform can achieve a natural level of human-computer interaction, enhance the user's sense of operation, and achieve high user satisfaction.</p></div>","PeriodicalId":33538,"journal":{"name":"Virtual Reality Intelligent Hardware","volume":"5 4","pages":"Pages 317-337"},"PeriodicalIF":0.0,"publicationDate":"2023-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49848598","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Modeling heterogeneous behaviors with different strategies in a terrorist attack
Pub Date: 2023-08-01 | DOI: 10.1016/j.vrih.2022.08.015
Le Bi, Tingting Liu, Zhen Liu, Jason Teo, Yumeng Zhao, Yanjie Chai
In terrorist-attack simulations, existing methods do not model individual differences, so different individuals never exhibit different behaviors. To address this problem, we propose a framework for modeling people's heterogeneous behaviors in a terrorist attack. For pedestrians, we construct an emotional model that takes personality and visual perception into account; this emotional model is then combined with the pedestrians' relationship networks to build the decision-making model, under which a pedestrian may exhibit altruistic behavior. For terrorists, a mapping model maps an antisocial personality to an attacking strategy. Experiments show that the proposed algorithm generates realistic heterogeneous behaviors consistent with existing psychological research findings.
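A minimal sketch of how personality can drive heterogeneous decisions in such a simulation (entirely our own illustration; the trait names, the linear fear update, and the thresholds are assumptions, not the paper's model):

```python
from dataclasses import dataclass

@dataclass
class Pedestrian:
    """Illustrative agent: personality traits in [0, 1] modulate fear."""
    bravery: float        # higher -> fear grows more slowly
    empathy: float        # higher -> more likely to help others
    fear: float = 0.0

    def update_fear(self, perceived_threat: float):
        # Fear rises with perceived threat, damped by bravery (assumed form).
        self.fear = min(1.0, self.fear + perceived_threat * (1.0 - self.bravery))

    def decide(self, friend_in_danger: bool) -> str:
        # Altruism emerges when empathy outweighs fear for a
        # relationship-network neighbor; otherwise flee or keep walking.
        if friend_in_danger and self.empathy > self.fear:
            return "help friend"
        return "flee" if self.fear > 0.5 else "walk"

# Two agents with different personalities react differently to one event.
for p in (Pedestrian(bravery=0.8, empathy=0.9), Pedestrian(bravery=0.2, empathy=0.3)):
    p.update_fear(perceived_threat=0.7)
    print(p.decide(friend_in_danger=True))   # "help friend" vs. "flee"
```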
{"title":"Modeling heterogeneous behaviors with different strategies in a terrorist attack","authors":"Le Bi , Tingting Liu , Zhen Liu , Jason Teo , Yumeng Zhao , Yanjie Chai","doi":"10.1016/j.vrih.2022.08.015","DOIUrl":"https://doi.org/10.1016/j.vrih.2022.08.015","url":null,"abstract":"<div><p>In terrorist attack simulations, existing methods do not describe individual differences, which means different individuals will not have different behaviors. To address this problem, we propose a framework to model people’s heterogeneous behaviors in terrorist attack. For pedestrian, we construct an emotional model that takes into account its personality and visual perception. The emotional model is then combined with pedestrians' relationship networks to make the decision-making model. With the proposed decision-making model, pedestrian may have altruistic behaviors. For terrorist, a mapping model is developed to map its antisocial personality to its attacking strategy. The experiments show that the proposed algorithm can generate realistic heterogeneous behaviors that are consistent with existing psychological research findings.</p></div>","PeriodicalId":33538,"journal":{"name":"Virtual Reality Intelligent Hardware","volume":"5 4","pages":"Pages 351-365"},"PeriodicalIF":0.0,"publicationDate":"2023-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49848596","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}