Pub Date: 2024-04-01 | DOI: 10.1016/j.vrih.2023.09.001
Dengzhen Lu, Hengyi Li, Boyu Qiu, Siyuan Liu, Shuhan Qi
Background
Most existing chemical-experiment teaching systems lack a genuinely immersive experience, making it difficult to engage students. To address this challenge, we propose a chemical simulation teaching system based on virtual reality and gesture interaction.
Methods
The model parameters were obtained through actual investigation; Blender and 3DS MAX were then used to build the models, which were imported into a physics engine together with these parameters. By establishing interfaces among the physics engine, the gesture-interaction hardware, and the virtual reality (VR) headset, a highly realistic chemical-experiment environment was created. Chemical phenomena were simulated using script logic, particle systems, and other engine systems. Furthermore, an online teaching platform based on streaming media and databases was created to address the problems of distance teaching.
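As an illustration of how script logic and a particle system can drive a simulated chemical phenomenon, the following minimal Python sketch emits and advects "bubble" particles for an effervescence-style effect. All class names and parameters are illustrative assumptions and are not taken from the authors' engine scripts.

```python
# Hypothetical sketch: script-driven particle effect approximating gas bubbles
# rising from a reaction vessel. Names and constants are illustrative only.
import random

class Bubble:
    def __init__(self, x, y):
        self.x, self.y = x, y          # position (m)
        self.vy = 0.0                  # vertical velocity (m/s)

class EffervescenceEmitter:
    def __init__(self, rate_per_step=3, buoyancy=0.5, kill_height=0.3):
        self.rate = rate_per_step      # bubbles emitted per simulation step
        self.buoyancy = buoyancy       # assumed upward acceleration (m/s^2)
        self.kill_height = kill_height # bubbles burst at the liquid surface
        self.particles = []

    def step(self, dt=0.02):
        # emit new bubbles near the reaction site
        for _ in range(self.rate):
            self.particles.append(Bubble(random.uniform(-0.01, 0.01), 0.0))
        # integrate motion and cull bubbles that reach the surface
        for p in self.particles:
            p.vy += self.buoyancy * dt
            p.y += p.vy * dt
        self.particles = [p for p in self.particles if p.y < self.kill_height]

emitter = EffervescenceEmitter()
for _ in range(100):
    emitter.step()
print(f"{len(emitter.particles)} bubbles currently visible")
```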
Results
The proposed system was evaluated against two mainstream commercial products and, in the experiments, outperformed both in terms of fidelity and practicality.
Conclusions
The proposed system, which offers realistic simulation and practicality, can help improve high school chemistry experimental education.
{"title":"Chemical simulation teaching system based on virtual reality and gesture interaction","authors":"Dengzhen Lu , Hengyi Li , Boyu Qiu , Siyuan Liu , Shuhan Qi","doi":"10.1016/j.vrih.2023.09.001","DOIUrl":"https://doi.org/10.1016/j.vrih.2023.09.001","url":null,"abstract":"<div><h3>Background</h3><p>Most existing chemical experiment teaching systems lack solid immersive experiences, making it difficult to engage students. To address these challenges, we propose a chemical simulation teaching system based on virtual reality and gesture interaction.</p></div><div><h3>Methods</h3><p>The parameters of the models were obtained through actual investigation, whereby Blender and 3DS MAX were used to model and import these parameters into a physics engine. By establishing an interface for the physics engine, gesture interaction hardware, and virtual reality (VR) helmet, a highly realistic chemical experiment environment was created. Using code script logic, particle systems, as well as other systems, chemical phenomena were simulated. Furthermore, we created an online teaching platform using streaming media and databases to address the problems of distance teaching.</p></div><div><h3>Results</h3><p>The proposed system was evaluated against two mainstream products in the market. In the experiments, the proposed system outperformed the other products in terms of fidelity and practicality.</p></div><div><h3>Conclusions</h3><p>The proposed system which offers realistic simulations and practicability, can help improve the high school chemistry experimental education.</p></div>","PeriodicalId":33538,"journal":{"name":"Virtual Reality Intelligent Hardware","volume":"6 2","pages":"Pages 148-168"},"PeriodicalIF":0.0,"publicationDate":"2024-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S209657962300061X/pdf?md5=5a61efaff7176636efdb6c186ffcfa7d&pid=1-s2.0-S209657962300061X-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140880273","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-04-01 | DOI: 10.1016/j.vrih.2024.02.002
Xiaoning Qiao, Wenming Xie, Xiaodong Peng, Guangyun Li, Dalin Li, Yingyi Guo, Jingyi Ren
Background
A task assigned to space-exploration satellites is to detect the physical environment within a certain region of space. However, space detection data are complex and abstract, and they do not lend themselves to researchers' visual perception of how events in the space environment evolve and interact.
Methods
A time-series dynamic data sampling method for large-scale space was proposed to sample the detection data in space and time, and the corresponding relationships between data location features and other attribute features were established. A tone-mapping method based on statistical histogram equalization was proposed and applied to the final attribute feature data. The visualization process was optimized for rendering by merging materials, reducing the number of patches, and performing other operations.
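The tone-mapping step can be illustrated with standard histogram equalization: each attribute value is mapped to the height of the empirical CDF at its bin, flattening the output distribution. The NumPy sketch below is a generic implementation of that technique on synthetic data, not the authors' code.

```python
import numpy as np

def equalize_tone(values, n_bins=256):
    """Map attribute values to [0, 1] so that the output histogram is roughly flat."""
    hist, bin_edges = np.histogram(values, bins=n_bins)
    cdf = np.cumsum(hist).astype(np.float64)
    cdf /= cdf[-1]                                        # normalise CDF to [0, 1]
    # look up each value's bin and use its CDF height as the mapped tone
    idx = np.clip(np.digitize(values, bin_edges[1:-1]), 0, n_bins - 1)
    return cdf[idx]

rng = np.random.default_rng(0)
flux = rng.lognormal(mean=0.0, sigma=1.5, size=10_000)    # heavily skewed attribute data
tones = equalize_tone(flux)
print(tones.min(), tones.max())                            # spans roughly the full [0, 1] range
```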
Results
Sampling, feature extraction, and uniform visualization were achieved for detection data with complex types, long time spans, and uneven spatial distributions. The real-time visualization of large-scale spatial structures on augmented reality devices, particularly low-performance devices, was also investigated.
Conclusions
The proposed visualization system can reconstruct the three-dimensional structure of a large-scale space, express the structure and changes in the spatial environment using augmented reality, and assist in intuitively discovering spatial environmental events and evolutionary rules.
{"title":"Large-scale spatial data visualization method based on augmented reality","authors":"Xiaoning Qiao , Wenming Xie , Xiaodong Peng , Guangyun Li , Dalin Li , Yingyi Guo , Jingyi Ren","doi":"10.1016/j.vrih.2024.02.002","DOIUrl":"https://doi.org/10.1016/j.vrih.2024.02.002","url":null,"abstract":"<div><h3>Background</h3><p>A task assigned to space exploration satellites involves detecting the physical environment within a certain space. However, space detection data are complex and abstract. These data are not conducive for researchers' visual perceptions of the evolution and interaction of events in the space environment.</p></div><div><h3>Methods</h3><p>A time-series dynamic data sampling method for large-scale space was proposed for sample detection data in space and time, and the corresponding relationships between data location features and other attribute features were established. A tone-mapping method based on statistical histogram equalization was proposed and applied to the final attribute feature data. The visualization process is optimized for rendering by merging materials, reducing the number of patches, and performing other operations.</p></div><div><h3>Results</h3><p>The results of sampling, feature extraction, and uniform visualization of the detection data of complex types, long duration spans, and uneven spatial distributions were obtained. The real-time visualization of large-scale spatial structures using augmented reality devices, particularly low-performance devices, was also investigated.</p></div><div><h3>Conclusions</h3><p>The proposed visualization system can reconstruct the three-dimensional structure of a large-scale space, express the structure and changes in the spatial environment using augmented reality, and assist in intuitively discovering spatial environmental events and evolutionary rules.</p></div>","PeriodicalId":33538,"journal":{"name":"Virtual Reality Intelligent Hardware","volume":"6 2","pages":"Pages 132-147"},"PeriodicalIF":0.0,"publicationDate":"2024-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2096579624000081/pdf?md5=340d5b042587b27ec24ac9e75b5af9d0&pid=1-s2.0-S2096579624000081-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140880272","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-02-01 | DOI: 10.1016/j.vrih.2023.08.006
Lichao Niu, Wenjun Xie, Dong Wang, Zhongrui Cao, Xiaoping Liu
Background
Considerable research has been conducted in the areas of audio-driven virtual character gestures and facial animation with some degree of success. However, few methods exist for generating full-body animations, and the portability of virtual character gestures and facial animations has not received sufficient attention.
Methods
Therefore, we propose a deep-learning-based audio-to-animation-and-blendshape (Audio2AB) network that generates gesture animations and ARKit's 52 facial-expression blendshape weights from audio, audio-corresponding text, emotion labels, and semantic relevance labels, thereby producing parametric data for full-body animations. This parameterization can be used to drive the full-body animations of virtual characters and improve their portability. In the experiments, we first downsampled the gesture and facial data so that the input, output, and facial data shared the same temporal resolution. The Audio2AB network then encoded the audio, audio-corresponding text, emotion labels, and semantic relevance labels, and fused the text, emotion, and semantic relevance labels into the audio features to obtain better audio representations. Finally, we established links between the body, gesture, and facial decoders and generated the corresponding animation sequences using our proposed GAN-GF loss function.
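The downsampling step, which brings the gesture stream and the facial blendshape stream to a common temporal resolution, can be sketched with simple linear interpolation. The frame rates, sequence lengths, and feature dimensions below are hypothetical stand-ins, not the paper's actual data.

```python
import numpy as np

def resample_frames(frames, target_len):
    """Linearly interpolate a (T, D) motion sequence to target_len frames."""
    frames = np.asarray(frames, dtype=np.float64)
    src_t = np.linspace(0.0, 1.0, num=frames.shape[0])
    dst_t = np.linspace(0.0, 1.0, num=target_len)
    return np.stack([np.interp(dst_t, src_t, frames[:, d])
                     for d in range((frames.shape[1]))], axis=1)

gesture_60fps = np.random.rand(600, 57)      # e.g. 10 s of joint rotations at 60 fps (assumed)
blendshape_30fps = np.random.rand(300, 52)   # 52 facial blendshape weights at 30 fps (assumed)
gesture_ds = resample_frames(gesture_60fps, 300)
print(gesture_ds.shape, blendshape_30fps.shape)  # both streams now have 300 frames
```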
Results
By using audio, audio-corresponding text, and emotional and semantic relevance labels as input, the trained Audio2AB network could generate gesture animation data containing blendshape weights. Therefore, different 3D virtual character animations could be created through parameterization.
Conclusions
The experimental results showed that the proposed method could generate significant gestures and facial animations.
{"title":"Audio2AB: Audio-driven collaborative generation of virtual character animation","authors":"Lichao Niu , Wenjun Xie , Dong Wang , Zhongrui Cao , Xiaoping Liu","doi":"10.1016/j.vrih.2023.08.006","DOIUrl":"https://doi.org/10.1016/j.vrih.2023.08.006","url":null,"abstract":"<div><h3>Background</h3><p>Considerable research has been conducted in the areas of audio-driven virtual character gestures and facial animation with some degree of success. However, few methods exist for generating full-body animations, and the portability of virtual character gestures and facial animations has not received sufficient attention.</p></div><div><h3>Methods</h3><p>Therefore, we propose a deep-learning-based audio-to-animation-and-blendshape (Audio2AB) network that generates gesture animations andARK it’s 52 facial expression parameter blendshape weights based on audio, audio-corresponding text, emotion labels, and semantic relevance labels to generate parametric data for full- body animations. This parameterization method can be used to drive full-body animations of virtual characters and improve their portability. In the experiment, we first downsampled the gesture and facial data to achieve the same temporal resolution for the input, output, and facial data. The Audio2AB network then encoded the audio, audio- corresponding text, emotion labels, and semantic relevance labels, and then fused the text, emotion labels, and semantic relevance labels into the audio to obtain better audio features. Finally, we established links between the body, gestures, and facial decoders and generated the corresponding animation sequences through our proposed GAN-GF loss function.</p></div><div><h3>Results</h3><p>By using audio, audio-corresponding text, and emotional and semantic relevance labels as input, the trained Audio2AB network could generate gesture animation data containing blendshape weights. Therefore, different 3D virtual character animations could be created through parameterization.</p></div><div><h3>Conclusions</h3><p>The experimental results showed that the proposed method could generate significant gestures and facial animations.</p></div>","PeriodicalId":33538,"journal":{"name":"Virtual Reality Intelligent Hardware","volume":"6 1","pages":"Pages 56-70"},"PeriodicalIF":0.0,"publicationDate":"2024-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2096579623000578/pdf?md5=643d5833200a7e29b7c69fe6f55dfabf&pid=1-s2.0-S2096579623000578-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139986860","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-02-01 | DOI: 10.1016/j.vrih.2023.08.007
Dvir Ginzburg, Dan Raviv
Background
Functional mapping, despite its proven efficiency, suffers from a "chicken or egg" scenario, in that poor spatial features lead to inadequate spectral alignment and vice versa during training, often resulting in slow convergence, high computational costs, and learning failures, particularly when small datasets are used.
Methods
A novel method is presented for dense-shape correspondence, whereby spatial information transformed by neural networks is combined with projections onto spectral maps to overcome the "chicken or egg" challenge by selectively sampling only points with high confidence in their alignment. These points then contribute to the alignment and spectral loss terms, boosting training and accelerating convergence by a factor of five. To ensure fully unsupervised learning, the Gromov–Hausdorff distance metric was used to select the points with the maximal alignment score, i.e., those displaying the most confidence.
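The confidence-based sampling idea can be illustrated as follows: given a soft correspondence matrix between two shapes, keep only the source points whose best match clearly dominates. The NumPy sketch below uses a simple top-k criterion on the per-point alignment score rather than the paper's Gromov–Hausdorff-based selection, so it should be read as a simplified stand-in.

```python
import numpy as np

def select_confident_points(P, keep_ratio=0.2):
    """
    P: (N, M) soft correspondence matrix (rows sum to 1) from source to target.
    Keep the source points whose best match dominates the row, i.e. the points
    that are most confident in their alignment, plus their matched targets.
    """
    best = P.max(axis=1)                       # alignment score per source point
    k = max(1, int(keep_ratio * P.shape[0]))
    keep = np.argsort(-best)[:k]               # indices of the top-k confident points
    targets = P[keep].argmax(axis=1)           # their matched target indices
    return keep, targets

rng = np.random.default_rng(1)
logits = rng.normal(size=(1000, 1000))
P = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
src_idx, tgt_idx = select_confident_points(P)
print(src_idx.shape, tgt_idx.shape)            # only these pairs feed the loss terms
```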
Results
The effectiveness of the proposed approach was demonstrated on several benchmark datasets, whereby results were reported as superior to those of spectral and spatial-based methods.
Conclusions
The proposed method provides a promising new approach to dense-shape correspondence, addressing the key challenges in the field and offering significant advantages over the current methods, including faster convergence, improved accuracy, and reduced computational costs.
{"title":"Selective sampling with Gromov–Hausdorff metric: Efficient dense-shape correspondence via Confidence-based sample consensus","authors":"Dvir Ginzburg, Dan Raviv","doi":"10.1016/j.vrih.2023.08.007","DOIUrl":"https://doi.org/10.1016/j.vrih.2023.08.007","url":null,"abstract":"<div><h3>Background</h3><p>Functional mapping, despite its proven efficiency, suffers from a “chicken or egg” sce- nario, in that, poor spatial features lead to inadequate spectral alignment and vice versa during training, often resulting in slow convergence, high computational costs, and learning failures, particularly when small datasets are used.</p></div><div><h3>Methods</h3><p>A novel method is presented for dense-shape correspondence, whereby the spatial information transformed by neural networks is combined with the projections onto spectral maps to overcome the “chicken or egg” challenge by selectively sampling only points with high confidence in their alignment. These points then contribute to the alignment and spectral loss terms, boosting training, and accelerating convergence by a factor of five. To ensure full unsupervised learning, the <em>Gromov–Hausdorff distance metric</em> was used to select the points with the maximal alignment score displaying most confidence.</p></div><div><h3>Results</h3><p>The effectiveness of the proposed approach was demonstrated on several benchmark datasets, whereby results were reported as superior to those of spectral and spatial-based methods.</p></div><div><h3>Conclusions</h3><p>The proposed method provides a promising new approach to dense-shape correspondence, addressing the key challenges in the field and offering significant advantages over the current methods, including faster convergence, improved accuracy, and reduced computational costs.</p></div>","PeriodicalId":33538,"journal":{"name":"Virtual Reality Intelligent Hardware","volume":"6 1","pages":"Pages 30-42"},"PeriodicalIF":0.0,"publicationDate":"2024-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S209657962300058X/pdf?md5=0d72c2ce81fa69712b18835a2698ec47&pid=1-s2.0-S209657962300058X-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139986004","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-02-01 | DOI: 10.1016/j.vrih.2023.08.005
Mingjian Li, Younhyun Jung, Michael Fulham, Jinman Kim
Background
A medical content-based image retrieval (CBIR) system is designed to retrieve images from large imaging repositories that are visually similar to a user's query image. CBIR is widely used in evidence-based diagnosis, teaching, and research. Although retrieval accuracy has largely improved, there has been limited development toward visualizing the important image features that indicate the similarity of retrieved images. Despite the prevalence of 3D volumetric data in medical imaging, such as computed tomography (CT), current CBIR systems still rely on 2D cross-sectional views to visualize retrieved images. Such 2D visualization requires users to browse through image stacks to confirm the similarity of the retrieved images and often involves mental reconstruction of 3D information, including the size, shape, and spatial relations of multiple structures. This process is time-consuming and reliant on users' experience.
Methods
In this study, we proposed an importance-aware 3D volume visualization method. The rendering parameters were automatically optimized to maximize the visibility of important structures that were detected and prioritized in the retrieval process. We then integrated the proposed visualization into a CBIR system, thereby complementing the 2D cross-sectional views for relevance feedback and further analyses.
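A minimal way to realize importance-aware rendering is to boost the opacity transfer function inside structures flagged as important, so they stay visible through surrounding context. The NumPy sketch below is a toy illustration of that idea with made-up data and thresholds; it is not the authors' automatic optimization procedure.

```python
import numpy as np

def importance_aware_opacity(volume, importance, base_opacity=0.05, boost=0.9):
    """
    Per-voxel opacity: a low base opacity everywhere, boosted inside voxels
    flagged as important, so prioritized structures dominate the rendering.
    """
    alpha = np.full(volume.shape, base_opacity, dtype=np.float32)
    alpha[importance > 0.5] = boost
    return alpha

ct = np.random.rand(64, 64, 64)                 # toy stand-in for a CT volume
tumour_mask = np.zeros_like(ct)                 # hypothetical importance map from retrieval
tumour_mask[20:30, 20:30, 20:30] = 1.0
alpha = importance_aware_opacity(ct, tumour_mask)
print(alpha.mean())                             # most of the volume stays nearly transparent
```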
Results
Our preliminary results demonstrate that 3D visualization can provide additional information, using multimodal positron emission tomography-computed tomography (PET-CT) images from a non-small-cell lung cancer dataset.
{"title":"Importance-aware 3D volume visualization for medical content-based image retrieval-a preliminary study","authors":"Mingjian Li , Younhyun Jung , Michael Fulham , Jinman Kim","doi":"10.1016/j.vrih.2023.08.005","DOIUrl":"https://doi.org/10.1016/j.vrih.2023.08.005","url":null,"abstract":"<div><h3>Background</h3><p>A medical content-based image retrieval (CBIR) system is designed to retrieve images from large imaging repositories that are visually similar to a user′s query image. CBIR is widely used in evidence- based diagnosis, teaching, and research. Although the retrieval accuracy has largely improved, there has been limited development toward visualizing important image features that indicate the similarity of retrieved images. Despite the prevalence of3D volumetric data in medical imaging such as computed tomography (CT), current CBIR systems still rely on 2D cross-sectional views for the visualization of retrieved images. Such 2D visualization requires users to browse through the image stacks to confirm the similarity of the retrieved images and often involves mental reconstruction of 3D information, including the size, shape, and spatial relations of multiple structures. This process is time-consuming and reliant on users’ experience.</p></div><div><h3>Methods</h3><p>In this study, we proposed an importance-aware 3D volume visualization method. The rendering parameters were automatically optimized to maximize the visibility of important structures that were detected and prioritized in the retrieval process. We then integrated the proposed visualization into a CBIR system, thereby complementing the 2D cross-sectional views for relevance feedback and further analyses.</p></div><div><h3>Results</h3><p>Our preliminary results demonstrate that 3D visualization can provide additional information using multimodal positron emission tomography and computed tomography (PET- CT) images of a non-small cell lung cancer dataset.</p></div>","PeriodicalId":33538,"journal":{"name":"Virtual Reality Intelligent Hardware","volume":"6 1","pages":"Pages 71-81"},"PeriodicalIF":0.0,"publicationDate":"2024-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2096579623000566/pdf?md5=771df0097b94f27ef3ca76e8f800722b&pid=1-s2.0-S2096579623000566-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139986000","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-02-01 | DOI: 10.1016/j.vrih.2022.10.002
Saziya Tabbassum, Rajesh Kumar Pathak
Wireless sensor networks (WSNs) sense and gather information samples in a certain region and communicate these readings to a base station (BS). Energy efficiency is a major design issue in WSNs and can be addressed using clustering and routing techniques. Information is sent from the source to the BS via routing procedures. However, these routing protocols must ensure that packets are delivered securely, guaranteeing that neither adversaries nor unauthorized individuals have access to the transmitted information. Secure data transfer is intended to protect the data from illegal access, damage, or disruption. Thus, in the proposed model, secure data transmission is developed in an energy-effective manner. A low-energy adaptive clustering hierarchy (LEACH) is used to transfer the data efficiently, and fuzzy logic and artificial neural networks (ANNs) are proposed for the intrusion detection system (IDS). Initially, the nodes are randomly placed in the network and initialized to gather information. To ensure fair energy dissipation between the nodes, LEACH randomly chooses cluster heads (CHs) and allocates this role to different nodes based on a round-robin management mechanism. The intrusion-detection procedure is then used to determine whether intruders are present in the network. Within the WSN, a fuzzy inference rule is used to distinguish malicious nodes from legal nodes, and an ANN is subsequently employed to distinguish harmful nodes from suspicious nodes. The effectiveness of the proposed approach was validated using metrics, attaining 97% accuracy, 97% specificity, and a sensitivity of 95%. Thus, the LEACH and fuzzy-based IDS approaches are shown to be sound choices for securing data transmission in an energy-efficient manner.
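The round-robin cluster-head rotation described above follows the classic LEACH threshold T(n) = p / (1 - p·(r mod 1/p)), applied only to nodes that have not recently served as CH. The Python sketch below implements that textbook rule with illustrative parameters; it is not the paper's implementation.

```python
import random

def leach_threshold(p, r):
    """Classic LEACH cluster-head threshold for round r, with CH fraction p."""
    return p / (1.0 - p * (r % int(round(1.0 / p))))

def elect_cluster_heads(node_ids, p=0.05, r=0, eligible=None):
    """
    Each eligible node becomes a CH with probability T(n); nodes that served as
    CH in the last 1/p rounds are excluded, giving LEACH's round-robin rotation.
    """
    eligible = set(node_ids) if eligible is None else eligible
    t = leach_threshold(p, r)
    return [n for n in node_ids if n in eligible and random.random() < t]

nodes = list(range(100))                      # hypothetical 100-node network
chs = elect_cluster_heads(nodes, p=0.05, r=3)
print(f"Round 3 cluster heads: {chs}")
```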
{"title":"Effective data transmission through energy-efficient clus- tering and Fuzzy-Based IDS routing approach in WSNs","authors":"Saziya Tabbassum (Research Scholar) , Rajesh Kumar Pathak (Vice Chancellor)","doi":"10.1016/j.vrih.2022.10.002","DOIUrl":"https://doi.org/10.1016/j.vrih.2022.10.002","url":null,"abstract":"<div><p>Wireless sensor networks (WSN) gather information and sense information samples in a certain region and communicate these readings to a base station (BS). Energy efficiency is considered a major design issue in the WSNs, and can be addressed using clustering and routing techniques. Information is sent from the source to the BS via routing procedures. However, these routing protocols must ensure that packets are delivered securely, guar- anteeing that neither adversaries nor unauthentic individuals have access to the sent information. Secure data transfer is intended to protect the data from illegal access, damage, or disruption. Thus, in the proposed model, secure data transmission is developed in an energy-effective manner. A low-energy adaptive clustering hierarchy (LEACH) is developed to efficiently transfer the data. For the intrusion detection systems (IDS), Fuzzy logic and artificial neural networks (ANNs) are proposed. Initially, the nodes were randomly placed in the network and initialized to gather information. To ensure fair energy dissipation between the nodes, LEACH randomly chooses cluster heads (CHs) and allocates this role to the various nodes based on a round-robin management mechanism. The intrusion-detection procedure was then utilized to determine whether intruders were present in the network. Within the WSN, a Fuzzy interference rule was utilized to distinguish the malicious nodes from legal nodes. Subsequently, an ANN was employed to distinguish the harmful nodes from suspicious nodes. The effectiveness of the proposed approach was validated using metrics that attained 97% accuracy, 97% specificity, and 97% sensitivity of 95%. Thus, it was proved that the LEACH and Fuzzy-based IDS approaches are the best choices for securing data transmission in an energy-efficient manner.</p></div>","PeriodicalId":33538,"journal":{"name":"Virtual Reality Intelligent Hardware","volume":"6 1","pages":"Pages 1-16"},"PeriodicalIF":0.0,"publicationDate":"2024-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2096579622001139/pdf?md5=33169ccdb2fe0c8e8a08f569df224af6&pid=1-s2.0-S2096579622001139-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139986001","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-02-01 | DOI: 10.1016/j.vrih.2023.08.001
Fei Li, Zhibao Qin, Kai Qian, Shaojun Liang, Chengli Li, Yonghang Tai
Background
Virtual reality technology has been widely used in surgical simulators, providing new opportunities for assessing and training surgical skills. Machine learning algorithms are commonly used to analyze and evaluate participants' performance; however, their limited interpretability restricts the personalization of training for individual participants.
Methods
Seventy-nine participants were recruited and divided into three groups based on their skill level in intracranial tumor resection. Data on the use of surgical tools were collected using a surgical simulator. Feature selection was performed using the minimum-redundancy maximum-relevance (mRMR) and SVM-RFE algorithms to obtain the final metrics for training the machine learning model. Five machine learning algorithms were trained to predict skill level; the support vector machine performed best, with an accuracy of 92.41% and an area under the curve (AUC) of 0.98253. The machine learning model was then interpreted using Shapley values to identify the important factors contributing to each participant's skill level.
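The SVM-RFE half of this pipeline can be approximated with scikit-learn's RFE wrapper around a linear SVM, followed by a classifier on the retained features. The data below are synthetic and the feature counts are assumptions, so this is only a sketch of the workflow described above, not the paper's code.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.feature_selection import RFE
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(79, 30))        # 79 participants x 30 candidate tool-use metrics (assumed)
y = rng.integers(0, 3, size=79)      # skill level: novice / intermediate / expert (toy labels)

# SVM-RFE: recursively eliminate the weakest features of a linear SVM
selector = RFE(SVC(kernel="linear"), n_features_to_select=10).fit(X, y)
X_sel = selector.transform(X)

# train the final skill-level classifier on the selected metrics
clf = SVC(kernel="rbf", probability=True)
scores = cross_val_score(clf, X_sel, y, cv=5)
print("CV accuracy:", scores.mean())
```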
Results
This study demonstrates the effectiveness of machine learning in differentiating the evaluation and training of virtual reality neurosurgical performance. The use of Shapley values enables targeted training by identifying deficiencies in individual skills.
Conclusions
This study provides insights into the use of machine learning for personalized training in virtual reality neurosurgery. The interpretability of the machine learning models enables the development of individualized training programs. In addition, the study highlights the potential of explanatory models in training external skills.
{"title":"Personalized assessment and training of neurosurgical skills in virtual reality: An interpretable machine learning approach","authors":"Fei Li , Zhibao Qin , Kai Qian , Shaojun Liang , Chengli Li , Yonghang Tai","doi":"10.1016/j.vrih.2023.08.001","DOIUrl":"https://doi.org/10.1016/j.vrih.2023.08.001","url":null,"abstract":"<div><h3>Background</h3><p>Virtual reality technology has been widely used in surgical simulators, providing new opportunities for assessing and training surgical skills. Machine learning algorithms are commonly used to analyze and evaluate the performance of participants. However, their interpretability limits the personalization of the training for individual participants.</p></div><div><h3>Methods</h3><p>Seventy-nine participants were recruited and divided into three groups based on their skill level in intracranial tumor resection. Data on the use of surgical tools were collected using a surgical simulator. Feature selection was performed using the Minimum Redundancy Maximum Relevance and SVM-RFE algorithms to obtain the final metrics for training the machine learning model. Five machine learning algorithms were trained to predict the skill level, and the support vector machine performed the best, with an accuracy of 92.41% and Area Under Curve value of0.98253. The machine learning model was interpreted using Shapley values to identify the important factors contributing to the skill level of each participant.</p></div><div><h3>Results</h3><p>This study demonstrates the effectiveness of machine learning in differentiating the evaluation and training of virtual reality neurosurgical per- formances. The use of Shapley values enables targeted training by identifying deficiencies in individual skills.</p></div><div><h3>Conclusions</h3><p>This study provides insights into the use of machine learning for personalized training in virtual reality neurosurgery. The interpretability of the machine learning models enables the development of individualized training programs. In addition, this study highlighted the potential of explanatory models in training external skills.</p></div>","PeriodicalId":33538,"journal":{"name":"Virtual Reality Intelligent Hardware","volume":"6 1","pages":"Pages 17-29"},"PeriodicalIF":0.0,"publicationDate":"2024-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2096579623000451/pdf?md5=4a05396e17452331858ce0f3bf7464a8&pid=1-s2.0-S2096579623000451-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139986002","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-02-01 | DOI: 10.1016/j.vrih.2023.07.002
Minghua Jiang, Zhangyuan Tian, Chenyu Yu, Yankang Shi, Li Liu, Tao Peng, Xinrong Hu, Feng Yu
Background
Intelligent garments, a burgeoning class of wearable devices, have extensive applications in domains such as sports training and medical rehabilitation. Nonetheless, existing research on smart wearables predominantly emphasizes sensor functionality and quantity, often overlooking crucial aspects of user experience and interaction.
Methods
To address this gap, this study introduces a novel real-time 3D interactive system based on intelligent garments. The system uses lightweight sensor modules to collect human motion data and introduces a dual-stream fusion network based on spiking neural units to classify and recognize human movements, thereby achieving real-time interaction between users and sensors. Additionally, the system incorporates 3D human visualization functionality, which renders the sensor data and recognized human actions as 3D models in real time, providing accurate and comprehensive visual feedback that helps users better understand and analyze the details and features of human motion. The system has significant potential for applications in motion detection, medical monitoring, virtual reality, and other fields, and the accurate classification of human actions contributes to the development of personalized training plans and injury-prevention strategies.
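The spiking neural units at the core of such a network can be illustrated with a minimal leaky integrate-and-fire (LIF) layer: the membrane potential leaks each step, accumulates input, and emits a binary spike when it crosses a threshold. Time constants, thresholds, and input shapes below are illustrative assumptions, not the paper's architecture.

```python
import numpy as np

def lif_layer(inputs, tau=0.9, v_threshold=1.0):
    """
    Minimal leaky integrate-and-fire layer.
    inputs: (T, N) input currents for T time steps and N neurons.
    Returns a (T, N) binary spike train.
    """
    T, N = inputs.shape
    v = np.zeros(N)                           # membrane potentials
    spikes = np.zeros((T, N))
    for t in range(T):
        v = tau * v + inputs[t]               # leak, then integrate input
        fired = v >= v_threshold
        spikes[t] = fired
        v[fired] = 0.0                        # hard reset after a spike
    return spikes

imu_features = np.random.rand(100, 64) * 0.3  # toy sensor-derived input currents
out_spikes = lif_layer(imu_features)
print("mean firing rate:", out_spikes.mean())
```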
Conclusions
This study has substantial implications in the domains of intelligent garments, human motion monitoring, and digital twin visualization. The advancement of this system is expected to propel the progress of wearable technology and foster a deeper comprehension of human motion.
{"title":"Intelligent 3D garment system of the human body based on deep spiking neural network","authors":"Minghua Jiang , Zhangyuan Tian , Chenyu Yu , Yankang Shi , Li Liu , Tao Peng , Xinrong Hu , Feng Yu","doi":"10.1016/j.vrih.2023.07.002","DOIUrl":"https://doi.org/10.1016/j.vrih.2023.07.002","url":null,"abstract":"<div><h3>Background</h3><p>Intelligent garments, a burgeoning class of wearable devices, have extensive applications in domains such as sports training and medical rehabilitation. Nonetheless, existing research in the smart wearables domain predominantly emphasizes sensor functionality and quantity, often skipping crucial aspects related to user experience and interaction.</p></div><div><h3>Methods</h3><p>To address this gap, this study introduces a novel real-time 3D interactive system based on intelligent garments. The system utilizes lightweight sensor modules to collect human motion data and introduces a dual-stream fusion network based on pulsed neural units to classify and recognize human movements, thereby achieving real-time interaction between users and sensors. Additionally, the system in- corporates 3D human visualization functionality, which visualizes sensor data and recognizes human actions as 3D models in realtime, providing accurate and comprehensive visual feedback to help users better understand and analyze the details and features of human motion. This system has significant potential for applications in motion detection, medical monitoring, virtual reality, and other fields. The accurate classification of human actions con- tributes to the development of personalized training plans and injury prevention strategies.</p></div><div><h3>Conclusions</h3><p>This study has substantial implications in the domains of intelligent garments, human motion monitoring, and digital twin visualization. The advancement of this system is expected to propel the progress of wearable technology and foster a deeper comprehension of human motion.</p></div>","PeriodicalId":33538,"journal":{"name":"Virtual Reality Intelligent Hardware","volume":"6 1","pages":"Pages 43-55"},"PeriodicalIF":0.0,"publicationDate":"2024-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S209657962300044X/pdf?md5=934866992a1e420fa2627cab1a89561d&pid=1-s2.0-S209657962300044X-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139986017","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2023-12-01 | DOI: 10.1016/j.vrih.2022.08.009
Masanori Nakayama, Karin Uchino, Ken Nagao, Issei Fujishiro
In modern society, digital signage installed in many large-scale facilities supports our daily life. However, with a limited screen size, it is difficult to simultaneously provide different types of information to many viewers at varying distances from the screen. Therefore, in this study, we extend existing research on the use of hybrid images for tiled displays. To facilitate smoother information selection, a new interactive display method is proposed that incorporates a touch-activated widget as the high-frequency part of a hybrid image; these widgets are novel in that they are more visible to viewers who are near the display. We develop an authoring tool, the Hybrid Image Display Resolutions Optimizer (HYDRO), which features two kinds of control functions for optimizing the visibility of the touch-activated widgets in terms of placement and resolution. The effectiveness of the present method is shown empirically via a quantitative user study and an eye-tracking-based qualitative evaluation.
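The underlying hybrid-image construction, low frequencies from the content meant to be seen at a distance combined with high frequencies from the near-viewing widget layer, can be sketched as follows. The blur radius and the random stand-in images are assumptions for illustration, not HYDRO's actual parameters.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def hybrid_image(far_img, near_img, sigma=6.0):
    """
    Combine the low frequencies of `far_img` (visible from a distance) with the
    high frequencies of `near_img` (visible up close): the classic hybrid-image recipe.
    Both inputs are float arrays in [0, 1] with identical shapes.
    """
    low = gaussian_filter(far_img, sigma=sigma)            # low-pass of the far content
    high = near_img - gaussian_filter(near_img, sigma=sigma)  # high-pass of the near content
    return np.clip(low + high, 0.0, 1.0)

far = np.random.rand(256, 256)    # stand-in for distance-oriented signage content
near = np.random.rand(256, 256)   # stand-in for a touch-activated widget layer
print(hybrid_image(far, near).shape)
```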
{"title":"HYDRO: Optimizing Interactive Hybrid Images for Digital Signage Content","authors":"Masanori Nakayama, Karin Uchino, Ken Nagao, Issei Fujishiro","doi":"10.1016/j.vrih.2022.08.009","DOIUrl":"10.1016/j.vrih.2022.08.009","url":null,"abstract":"<div><p>In modern society, digital signage installed in many large-scale facilities supports our daily life. However, with a limited screen size, it is difficult to provide different types of information for many viewers at varying distances from the screen simultaneously. Therefore, in this study, we extend existing research on the use of hybrid images for tiled displays. To facilitate smoother information selection, a new interactive display method is proposed that incorporates a touchactivated widget as a high-frequency part of the hybrid image; these widgets are novel in that they are more visible to the viewers near to the display. We develop an authoring tool that we call the Hybrid Image Display Resolutions Optimizer (HYDRO); it features two kinds of control functions by which to optimize the visibility of the touch-activated widgets in terms of placement and resolution. The effectiveness of the present method is shown empirically via a quantitative user study and an eye-tracking-based qualitative evaluation.</p></div>","PeriodicalId":33538,"journal":{"name":"Virtual Reality Intelligent Hardware","volume":"5 6","pages":"Pages 565-577"},"PeriodicalIF":0.0,"publicationDate":"2023-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2096579622000845/pdf?md5=66bf13c453add643fb720daf9ae46a21&pid=1-s2.0-S2096579622000845-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139021992","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2023-12-01 | DOI: 10.1016/j.vrih.2023.06.003
MingKang Wang, Min Meng, Jigang Liu, Jigang Wu
Cross-modal retrieval has attracted widespread attention in many cross-media similarity search applications, especially image-text retrieval in the fields of computer vision and natural language processing. Recently, visual and semantic embedding (VSE) learning has shown promising improvements in image-text retrieval tasks. Most existing VSE models employ two unrelated encoders to extract features and then use complex methods to contextualize and aggregate those features into holistic embeddings. Despite recent advances, existing approaches still suffer from two limitations: 1) without considering intermediate interaction and adequate alignment between the modalities, these models cannot guarantee the discriminative ability of the representations; and 2) existing feature aggregators are susceptible to certain noisy regions, which may lead to unreasonable pooling coefficients and affect the quality of the final aggregated features. To address these challenges, we propose a novel cross-modal retrieval model containing a well-designed alignment module and a novel multimodal fusion encoder, which aims to learn adequate alignment and interaction over aggregated features to effectively bridge the modality gap. Experiments on the Microsoft COCO and Flickr30k datasets demonstrate the superiority of our model over state-of-the-art methods.
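A common way to train such cross-modal alignment is a VSE-style hinge loss with hardest negatives over the image-text similarity matrix: each matched pair is pushed above its hardest non-matching pair by a margin. The NumPy sketch below implements that standard loss; it is not necessarily the exact alignment module proposed in the paper.

```python
import numpy as np

def vse_hinge_loss(img_emb, txt_emb, margin=0.2):
    """
    VSE-style hinge loss with hardest negatives: embeddings are L2-normalised,
    a cosine similarity matrix is formed, and each matched (diagonal) pair must
    exceed its hardest negative in both directions by `margin`.
    """
    img = img_emb / np.linalg.norm(img_emb, axis=1, keepdims=True)
    txt = txt_emb / np.linalg.norm(txt_emb, axis=1, keepdims=True)
    sim = img @ txt.T                                  # (B, B) similarity matrix
    pos = np.diag(sim)                                 # matched-pair similarities
    mask = np.eye(sim.shape[0], dtype=bool)
    neg_for_img = np.where(mask, -np.inf, sim).max(axis=1)  # hardest caption per image
    neg_for_txt = np.where(mask, -np.inf, sim).max(axis=0)  # hardest image per caption
    loss = np.maximum(0, margin + neg_for_img - pos) + \
           np.maximum(0, margin + neg_for_txt - pos)
    return loss.mean()

rng = np.random.default_rng(0)
print(vse_hinge_loss(rng.normal(size=(32, 512)), rng.normal(size=(32, 512))))
```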
{"title":"Learning Adequate Alignment and Interaction for Cross-Modal Retrieval","authors":"MingKang Wang , Min Meng , Jigang Liu , Jigang Wu","doi":"10.1016/j.vrih.2023.06.003","DOIUrl":"10.1016/j.vrih.2023.06.003","url":null,"abstract":"<div><p>Cross-modal retrieval has attracted widespread attention in many cross-media similarity search applications, especially image-text retrieval in the fields of computer vision and natural language processing. Recently, visual and semantic embedding (VSE) learning has shown promising improvements on image-text retrieval tasks. Most existing VSE models employ two unrelated encoders to extract features, then use complex methods to contextualize and aggregate those features into holistic embeddings. Despite recent advances, existing approaches still suffer from two limitations: 1) without considering intermediate interaction and adequate alignment between different modalities, these models cannot guarantee the discriminative ability of representations; 2) existing feature aggregators are susceptible to certain noisy regions, which may lead to unreasonable pooling coefficients and affect the quality of the final aggregated features. To address these challenges, we propose a novel cross-modal retrieval model containing a well-designed alignment module and a novel multimodal fusion encoder, which aims to learn adequate alignment and interaction on aggregated features for effectively bridging the modality gap. Experiments on Microsoft COCO and Flickr30k datasets demonstrates the superiority of our model over the state-of-the-art methods.</p></div>","PeriodicalId":33538,"journal":{"name":"Virtual Reality Intelligent Hardware","volume":"5 6","pages":"Pages 509-522"},"PeriodicalIF":0.0,"publicationDate":"2023-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S209657962300027X/pdf?md5=12c947f69173683c04a27c84c4b305fc&pid=1-s2.0-S209657962300027X-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139017416","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}