ARGA-Unet: Advanced U-net segmentation model using residual grouped convolution and attention mechanism for brain tumor MRI image segmentation
Pub Date: 2024-06-01 | DOI: 10.1016/j.vrih.2023.05.001 | Virtual Reality & Intelligent Hardware 6(3): 203-216
Siyi XUN, Yan ZHANG, Sixu DUAN, Mingwei WANG, Jiangang CHEN, Tong TONG, Qinquan GAO, Chantong LAM, Menghan HU, Tao TAN
Background
Magnetic resonance imaging (MRI) plays an important role in medical imaging diagnostics, particularly in the diagnosis and treatment of brain tumors, owing to its non-invasiveness and superior soft-tissue contrast. However, because brain tumors are invasive and highly heterogeneous, they appear highly non-uniform and have indistinct boundaries in MRI images. In addition, manual labeling of tumor areas is time-consuming and laborious.
Methods
To address these issues, this study improves the classical segmentation network U-net with a residual grouped convolution module, a convolutional block attention module (CBAM), and bilinear interpolation upsampling. The influence of network normalization, the loss function, and network depth on segmentation performance is also examined.
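The abstract gives only the names of the building blocks; as a rough illustration, the following is a minimal PyTorch sketch of a residual grouped-convolution block combined with CBAM-style channel and spatial attention, with bilinear upsampling for the decoder. The module layout, group count, and reduction ratio are assumptions for illustration, not the authors' implementation.

```python
# Minimal sketch (assumption, not the authors' code) of a residual
# grouped-convolution block with CBAM-style attention, as named in the Methods.
import torch
import torch.nn as nn

class ResidualGroupedBlock(nn.Module):
    def __init__(self, channels: int, groups: int = 8):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1, groups=groups, bias=False),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1, groups=groups, bias=False),
            nn.BatchNorm2d(channels),
        )
        # CBAM-style channel attention: shared MLP over avg- and max-pooled features
        self.channel_mlp = nn.Sequential(
            nn.Conv2d(channels, channels // 4, 1), nn.ReLU(inplace=True),
            nn.Conv2d(channels // 4, channels, 1),
        )
        # CBAM-style spatial attention over channel-wise avg and max maps
        self.spatial = nn.Conv2d(2, 1, kernel_size=7, padding=3)

    def forward(self, x):
        y = self.conv(x)
        ca = torch.sigmoid(
            self.channel_mlp(torch.mean(y, dim=(2, 3), keepdim=True))
            + self.channel_mlp(torch.amax(y, dim=(2, 3), keepdim=True))
        )
        y = y * ca
        sa = torch.sigmoid(self.spatial(
            torch.cat([y.mean(dim=1, keepdim=True), y.amax(dim=1, keepdim=True)], dim=1)))
        y = y * sa
        return torch.relu(x + y)  # residual connection around the attended features

# Decoder upsampling by bilinear interpolation instead of transposed convolution.
up = nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False)
```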
Results
In the experiments, the Dice score of the proposed segmentation model reached 97.581%, which is 12.438% higher than that of the traditional U-net, demonstrating effective segmentation of brain tumor MRI images.
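For reference, the Dice score used here is the standard overlap measure between a predicted mask and the ground-truth mask, Dice = 2|A∩B| / (|A| + |B|); a minimal computation for binary masks might look like this:

```python
import numpy as np

def dice_score(pred: np.ndarray, target: np.ndarray, eps: float = 1e-7) -> float:
    """Dice = 2|A n B| / (|A| + |B|) for binary segmentation masks."""
    pred, target = pred.astype(bool), target.astype(bool)
    intersection = np.logical_and(pred, target).sum()
    return (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)
```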
Conclusions
The improved U-net network achieves a good segmentation effect on brain tumor MRI images.
{"title":"ARGA-Unet: Advanced U-net segmentation model using residual grouped convolution and attention mechanism for brain tumor MRI image segmentation","authors":"Siyi XUN , Yan ZHANG , Sixu DUAN , Mingwei WANG , Jiangang CHEN , Tong TONG , Qinquan GAO , Chantong LAM , Menghan HU , Tao TAN","doi":"10.1016/j.vrih.2023.05.001","DOIUrl":"https://doi.org/10.1016/j.vrih.2023.05.001","url":null,"abstract":"<div><h3>Background</h3><p>Magnetic resonance imaging (MRI) has played an important role in the rapid growth of medical imaging diagnostic technology, especially in the diagnosis and treatment of brain tumors owing to its non-invasive characteristics and superior soft tissue contrast. However, brain tumors are characterized by high non-uniformity and non-obvious boundaries in MRI images because of their invasive and highly heterogeneous nature. In addition, the labeling of tumor areas is time-consuming and laborious.</p></div><div><h3>Methods</h3><p>To address these issues, this study uses a residual grouped convolution module, convolutional block attention module, and bilinear interpolation upsampling method to improve the classical segmentation network U-net. The influence of network normalization, loss function, and network depth on segmentation performance is further considered.</p></div><div><h3>Results</h3><p>In the experiments, the Dice score of the proposed segmentation model reached 97.581%, which is 12.438% higher than that of traditional U-net, demonstrating the effective segmentation of MRI brain tumor images.</p></div><div><h3>Conclusions</h3><p>In conclusion, we use the improved U-net network to achieve a good segmentation effect of brain tumor MRI images.</p></div>","PeriodicalId":33538,"journal":{"name":"Virtual Reality Intelligent Hardware","volume":"6 3","pages":"Pages 203-216"},"PeriodicalIF":0.0,"publicationDate":"2024-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2096579623000232/pdfft?md5=5e16730452951aa1e3b2edacee01d06e&pid=1-s2.0-S2096579623000232-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141481556","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Face animation based on multiple sources and perspective alignment
Pub Date: 2024-06-01 | DOI: 10.1016/j.vrih.2024.04.002 | Virtual Reality & Intelligent Hardware 6(3): 252-266
Yuanzong Mei, Wenyi Wang, Xi Liu, Wei Yong, Weijie Wu, Yifan Zhu, Shuai Wang, Jianwen Chen
Background
Face image animation generates a synthetic human face video that harmoniously integrates the identity derived from a source image and the facial motion obtained from a driving video. This technology could benefit multiple medical fields, such as diagnosis and privacy protection. Previous studies on face animation often relied on a single source image to generate the output video. When there is a significant pose difference between the source image and the driving frame, the quality of the generated video is likely to be suboptimal because a single source image may not provide sufficient features for the warped feature map.
Methods
In this study, we propose a novel face-animation scheme based on multiple sources and perspective alignment to address these issues. We first introduce a multiple-source sampling and selection module to screen the optimal source image set from the provided driving video. We then propose an inter-frame interpolation and alignment module to further eliminate the misalignment between the selected source image and the driving frame.
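The abstract does not specify the selection criterion; one plausible, purely illustrative reading of a multiple-source sampling and selection step is to keep the candidate frames whose estimated head pose lies closest to the driving frame, as in the sketch below (the pose estimates are assumed to come from an external estimator):

```python
import numpy as np

def select_source_frames(driving_pose: np.ndarray,
                         candidate_poses: np.ndarray,
                         k: int = 3) -> np.ndarray:
    """Return indices of the k candidate frames whose (yaw, pitch, roll) is
    closest to the driving frame's pose -- a stand-in for a multiple-source
    sampling and selection module, not the paper's actual criterion."""
    dists = np.linalg.norm(candidate_poses - driving_pose[None, :], axis=1)
    return np.argsort(dists)[:k]

# Example: pick 3 of 5 candidate source frames for one driving frame.
poses = np.array([[30, 0, 0], [10, 5, 0], [0, 0, 0], [-20, 0, 5], [5, -5, 0]], float)
print(select_source_frames(np.array([2.0, 1.0, 0.0]), poses, k=3))
```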
Conclusions
The proposed method outperforms other state-of-the-art face animation methods in terms of objective metrics and visual quality in large-angle animation scenes, indicating its effectiveness in addressing distortion in large-angle animation.
{"title":"Face animation based on multiple sources and perspective alignment","authors":"Yuanzong Mei , Wenyi Wang , Xi Liu , Wei Yong , Weijie Wu , Yifan Zhu , Shuai Wang , Jianwen Chen","doi":"10.1016/j.vrih.2024.04.002","DOIUrl":"https://doi.org/10.1016/j.vrih.2024.04.002","url":null,"abstract":"<div><h3>Background</h3><p>Face image animation generates a synthetic human face video that harmoniously integrates the identity derived from the source image and facial motion obtained from the driving video. This technology could be beneficial in multiple medical fields, such as diagnosis and privacy protection<em>.</em> Previous studies on face animation often relied on a single source image to generate an output video. With a significant pose difference between the source image and the driving frame, the quality of the generated video is likely to be suboptimal because the source image may not provide sufficient features for the warped feature map.</p></div><div><h3>Methods</h3><p>In this study, we propose a novel face-animation scheme based on multiple sources and perspective alignment to address these issues. We first introduce a multiple-source sampling and selection module to screen the optimal source image set from the provided driving video. We then propose an inter-frame interpolation and alignment module to further eliminate the misalignment between the selected source image and the driving frame.</p></div><div><h3>Conclusions</h3><p>The proposed method exhibits superior performance in terms of objective metrics and visual quality in large-angle animation scenes compared to other state-of-the-art face animation methods. It indicates the effectiveness of the proposed method in addressing the distortion issues in large-angle animation.</p></div>","PeriodicalId":33538,"journal":{"name":"Virtual Reality Intelligent Hardware","volume":"6 3","pages":"Pages 252-266"},"PeriodicalIF":0.0,"publicationDate":"2024-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2096579624000202/pdfft?md5=2a9475967792588ba319db5427a9033d&pid=1-s2.0-S2096579624000202-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141484842","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A review of medical ocular image segmentation
Pub Date: 2024-06-01 | DOI: 10.1016/j.vrih.2024.04.001 | Virtual Reality & Intelligent Hardware 6(3): 181-202
Lai WEI, Menghan HU
Deep learning has been extensively applied to medical image segmentation, and the field of deep neural networks for medical image segmentation has advanced significantly since the notable success of U-Net in 2015. However, applying deep learning models to ocular medical image segmentation poses unique challenges: compared with images of other body parts, ocular images are complex, small, and blurry, and data are scarce. This article provides a comprehensive review of medical image segmentation from two perspectives: the development of deep network structures and the application of segmentation to ocular imaging. The article first gives an overview of medical imaging, data processing, and performance evaluation metrics. It then analyzes recent developments in U-Net-based network structures. Finally, the application of deep learning to the segmentation of ocular medical images is reviewed and categorized by the type of ocular tissue.
{"title":"A review of medical ocular image segmentation","authors":"Lai WEI, Menghan HU","doi":"10.1016/j.vrih.2024.04.001","DOIUrl":"https://doi.org/10.1016/j.vrih.2024.04.001","url":null,"abstract":"<div><p>Deep learning has been extensively applied to medical image segmentation, resulting in significant advancements in the field of deep neural networks for medical image segmentation since the notable success of U-Net in 2015. However, the application of deep learning models to ocular medical image segmentation poses unique challenges, especially compared to other body parts, due to the complexity, small size, and blurriness of such images, coupled with the scarcity of data. This article aims to provide a comprehensive review of medical image segmentation from two perspectives: the development of deep network structures and the application of segmentation in ocular imaging. Initially, the article introduces an overview of medical imaging, data processing, and performance evaluation metrics. Subsequently, it analyzes recent developments in U-Net-based network structures. Finally, for the segmentation of ocular medical images, the application of deep learning is reviewed and categorized by the type of ocular tissue.</p></div>","PeriodicalId":33538,"journal":{"name":"Virtual Reality Intelligent Hardware","volume":"6 3","pages":"Pages 181-202"},"PeriodicalIF":0.0,"publicationDate":"2024-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S209657962400010X/pdfft?md5=c30a9952442a34ae8a35e52683ed1214&pid=1-s2.0-S209657962400010X-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141484810","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Automatic detection of breast lesions in automated 3D breast ultrasound with cross-organ transfer learning
Pub Date: 2024-06-01 | DOI: 10.1016/j.vrih.2024.02.001 | Virtual Reality & Intelligent Hardware 6(3): 239-251
Lingyun BAO, Zhengrui HUANG, Zehui LIN, Yue SUN, Hui CHEN, You LI, Zhang LI, Xiaochen YUAN, Lin XU, Tao TAN
Background
Deep convolutional neural networks have garnered considerable attention in numerous machine learning applications, particularly in visual recognition tasks such as image and video analysis. There is growing interest in applying this technology to diverse applications in medical image analysis. Automated three-dimensional breast ultrasound is a vital tool for detecting breast cancer, and computer-assisted diagnosis software developed using deep learning can effectively assist radiologists in diagnosis. However, the network model is prone to overfitting during training owing to challenges such as insufficient training data. This study attempts to solve the problems caused by small datasets and to improve model detection performance.
Methods
We propose a breast cancer detection framework based on deep learning that combines a transfer learning method based on cross-organ cancer detection with a contrastive learning method based on the Breast Imaging Reporting and Data System (BI-RADS).
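The abstract does not detail the loss; one possible (assumed) reading of BI-RADS-based contrastive learning is a supervised contrastive objective in which lesions sharing a BI-RADS category act as positive pairs, roughly as sketched below. This is an interpretation for illustration, not the authors' formulation.

```python
import torch
import torch.nn.functional as F

def birads_supcon_loss(features: torch.Tensor, birads: torch.Tensor, tau: float = 0.1):
    """Supervised-contrastive-style loss: embeddings with the same BI-RADS
    category are pulled together, all others pushed apart. Assumed
    interpretation of the abstract, not the authors' exact objective."""
    z = F.normalize(features, dim=1)                  # (N, D) unit-norm embeddings
    sim = z @ z.t() / tau                             # temperature-scaled similarities
    n = z.size(0)
    logits_mask = ~torch.eye(n, dtype=torch.bool)     # exclude self-pairs
    pos_mask = (birads[:, None] == birads[None, :]) & logits_mask
    exp_sim = torch.exp(sim) * logits_mask.float()
    log_prob = sim - torch.log(exp_sim.sum(dim=1, keepdim=True) + 1e-12)
    pos_counts = pos_mask.sum(dim=1).clamp(min=1)
    loss = -(log_prob * pos_mask.float()).sum(dim=1) / pos_counts
    return loss.mean()

# Example: 4 lesion embeddings with BI-RADS categories 2, 2, 4, 5.
loss = birads_supcon_loss(torch.randn(4, 128), torch.tensor([2, 2, 4, 5]))
```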
Results
When cross-organ transfer learning and BI-RADS-based contrastive learning were used, the average sensitivity of the model increased by up to 16.05%.
Conclusion
Our experiments demonstrate that the parameters and experience gained from cross-organ cancer detection can be mutually referenced, and that the BI-RADS-based contrastive learning method improves the detection performance of the model.
{"title":"Automatic detection of breast lesions in automated 3D breast ultrasound with cross-organ transfer learning","authors":"B.A.O. Lingyun , Zhengrui HUANG , Zehui LIN , Yue SUN , Hui CHEN , You LI , Zhang LI , Xiaochen YUAN , Lin XU , Tao TAN","doi":"10.1016/j.vrih.2024.02.001","DOIUrl":"https://doi.org/10.1016/j.vrih.2024.02.001","url":null,"abstract":"<div><h3>Background</h3><p>Deep convolutional neural networks have garnered considerable attention in numerous machine learning applications, particularly in visual recognition tasks such as image and video analyses. There is a growing interest in applying this technology to diverse applications in medical image analysis. Automated three-dimensional Breast Ultrasound is a vital tool for detecting breast cancer, and computer-assisted diagnosis software, developed based on deep learning, can effectively assist radiologists in diagnosis. However, the network model is prone to overfitting during training, owing to challenges such as insufficient training data. This study attempts to solve the problem caused by small datasets and improve model detection performance.</p></div><div><h3>Methods</h3><p>We propose a breast cancer detection framework based on deep learning (a transfer learning method based on cross-organ cancer detection) and a contrastive learning method based on breast imaging reporting and data systems (BI-RADS).</p></div><div><h3>Results</h3><p>When using cross organ transfer learning and BIRADS based contrastive learning, the average sensitivity of the model increased by a maximum of 16.05%.</p></div><div><h3>Conclusion</h3><p>Our experiments have demonstrated that the parameters and experiences of cross-organ cancer detection can be mutually referenced, and contrastive learning method based on BI-RADS can improve the detection performance of the model.</p></div>","PeriodicalId":33538,"journal":{"name":"Virtual Reality Intelligent Hardware","volume":"6 3","pages":"Pages 239-251"},"PeriodicalIF":0.0,"publicationDate":"2024-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S209657962400007X/pdfft?md5=a1bdf0d74f499e2548f6f5735dd9b5bf&pid=1-s2.0-S209657962400007X-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141484848","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Combining machine and deep transfer learning for mediastinal lymph node evaluation in patients with lung cancer
Pub Date: 2024-06-01 | DOI: 10.1016/j.vrih.2023.08.002 | Virtual Reality & Intelligent Hardware 6(3): 226-238
Hui XIE, Jianfang ZHANG, Lijuan DING, Tao TAN, Qing LI
Background
The prognosis and survival of patients with lung cancer are likely to deteriorate with metastasis. Using deep learning to detect lymph node metastasis can facilitate a noninvasive estimate of the likelihood of such metastasis, thereby providing clinicians with crucial information to enhance diagnostic precision and ultimately improve patient survival and prognosis.
Methods
In total, 623 eligible patients were recruited from two medical institutions. Seven deep learning models, namely AlexNet, GoogLeNet, ResNet18, ResNet101, VGG16, VGG19, and MobileNetV3 (small), were used to extract deep image histological features. The dimensionality of the extracted features was then reduced using the Spearman correlation coefficient (r ≥ 0.9) and the Least Absolute Shrinkage and Selection Operator (LASSO). Eleven machine learning methods, namely Support Vector Machine, K-Nearest Neighbor, Random Forest, Extra Trees, XGBoost, LightGBM, Naive Bayes, AdaBoost, Gradient Boosting Decision Tree, Linear Regression, and Multilayer Perceptron, were employed to construct classification prediction models for the filtered final features. The diagnostic performance of the models was assessed using various metrics, including accuracy, area under the receiver operating characteristic curve, sensitivity, specificity, positive predictive value, and negative predictive value. Calibration and decision-curve analyses were also performed.
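A rough scikit-learn sketch of the described feature-filtering pipeline, Spearman correlation pruning at r ≥ 0.9 followed by LASSO selection, is shown below; the downstream logistic regression is used only as a stand-in for the eleven classifiers listed, and all other details are assumptions.

```python
import numpy as np
from scipy.stats import spearmanr
from sklearn.linear_model import LassoCV, LogisticRegression
from sklearn.preprocessing import StandardScaler

def drop_correlated(X: np.ndarray, threshold: float = 0.9) -> np.ndarray:
    """Greedily keep features so that no retained pair has |Spearman r| >= threshold.
    Assumes X has more than two feature columns."""
    rho, _ = spearmanr(X)            # feature-by-feature correlation matrix
    rho = np.abs(rho)
    keep = []
    for j in range(X.shape[1]):
        if all(rho[j, k] < threshold for k in keep):
            keep.append(j)
    return np.array(keep)

def select_with_lasso(X: np.ndarray, y: np.ndarray) -> np.ndarray:
    """Keep features whose LASSO coefficients are non-zero."""
    lasso = LassoCV(cv=5).fit(StandardScaler().fit_transform(X), y)
    return np.flatnonzero(lasso.coef_)

# Usage sketch: X holds deep features from a pretrained CNN, y the node-status labels.
# kept = drop_correlated(X); sel = kept[select_with_lasso(X[:, kept], y)]
# clf = LogisticRegression(max_iter=1000).fit(X[:, sel], y)
```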
Results
The present study demonstrated that deep radiomic features extracted from VGG16, in conjunction with a prediction model constructed using a linear regression algorithm, effectively distinguished the status of mediastinal lymph nodes in patients with lung cancer. The performance of the model was evaluated using accuracy, area under the receiver operating characteristic curve, sensitivity, specificity, positive predictive value, and negative predictive value, which yielded values of 0.808, 0.834, 0.851, 0.745, 0.829, and 0.776, respectively. The validation set of the model was assessed using clinical decision curves, calibration curves, and confusion matrices, which collectively demonstrated the model's stability and accuracy.
Conclusion
In this study, deep radiomic features were extracted from computed tomography images using VGG16, and the linear regression method accurately diagnosed mediastinal lymph node metastases in patients with lung cancer.
{"title":"Combining machine and deep transfer learning for mediastinal lymph node evaluation in patients with lung cancer","authors":"Hui XIE , Jianfang ZHANG , Lijuan DING , Tao TAN , Qing LI","doi":"10.1016/j.vrih.2023.08.002","DOIUrl":"https://doi.org/10.1016/j.vrih.2023.08.002","url":null,"abstract":"<div><h3>Background</h3><p>The prognosis and survival of patients with lung cancer are likely to deteriorate with metastasis. Using deep-learning in the detection of lymph node metastasis can facilitate the noninvasive calculation of the likelihood of such metastasis, thereby providing clinicians with crucial information to enhance diagnostic precision and ultimately improve patient survival and prognosis</p></div><div><h3>Methods</h3><p>In total, 623 eligible patients were recruited from two medical institutions. Seven deep learning models, namely Alex, GoogLeNet, Resnet18, Resnet101, Vgg16, Vgg19, and MobileNetv3 (small), were utilized to extract deep image histological features. The dimensionality of the extracted features was then reduced using the Spearman correlation coefficient (r ≥ 0.9) and Least Absolute Shrinkage and Selection Operator. Eleven machine learning methods, namely Support Vector Machine, K-nearest neighbor, Random Forest, Extra Trees, XGBoost, LightGBM, Naive Bayes, AdaBoost, Gradient Boosting Decision Tree, Linear Regression, and Multilayer Perceptron, were employed to construct classification prediction models for the filtered final features. The diagnostic performances of the models were assessed using various metrics, including accuracy, area under the receiver operating characteristic curve, sensitivity, specificity, positive predictive value, and negative predictive value. Calibration and decision-curve analyses were also performed.</p></div><div><h3>Results</h3><p>The present study demonstrated that using deep radiomic features extracted from Vgg16, in conjunction with a prediction model constructed via a linear regression algorithm, effectively distinguished the status of mediastinal lymph nodes in patients with lung cancer. The performance of the model was evaluated based on various metrics, including accuracy, area under the receiver operating characteristic curve, sensitivity, specificity, positive predictive value, and negative predictive value, which yielded values of 0.808, 0.834, 0.851, 0.745, 0.829, and 0.776, respectively. 
The validation set of the model was assessed using clinical decision curves, calibration curves, and confusion matrices, which collectively demonstrated the model's stability and accuracy</p></div><div><h3>Conclusion</h3><p>In this study, information on the deep radiomics of Vgg16 was obtained from computed tomography images, and the linear regression method was able to accurately diagnose mediastinal lymph node metastases in patients with lung cancer.</p></div>","PeriodicalId":33538,"journal":{"name":"Virtual Reality Intelligent Hardware","volume":"6 3","pages":"Pages 226-238"},"PeriodicalIF":0.0,"publicationDate":"2024-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2096579623000463/pdfft?md5=d355b811e3e99356748d10c345ee1b33&pid=1-s2.0-S2096579623000463-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141484841","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Intelligent diagnosis of atrial septal defect in children using echocardiography with deep learning
Pub Date: 2024-06-01 | DOI: 10.1016/j.vrih.2023.05.002 | Virtual Reality & Intelligent Hardware 6(3): 217-225
Yiman LIU, Size HOU, Xiaoxiang HAN, Tongtong LIANG, Menghan HU, Xin WANG, Wei GU, Yuqi ZHANG, Qingli LI, Jiangang CHEN
Background
Atrial septal defect (ASD) is one of the most common congenital heart diseases. The diagnosis of ASD via transthoracic echocardiography is subjective and time-consuming.
Methods
The objective of this study was to evaluate the feasibility and accuracy of automatically detecting ASD in children from static color Doppler echocardiographic images using end-to-end convolutional neural networks. The proposed depthwise separable convolution model identifies ASDs from static color Doppler images in standard views. Among the standard views, we selected two echocardiographic views: the subcostal sagittal view of the atrial septum and the low parasternal four-chamber view. The ASD detection system was developed using a training set of 396 echocardiographic images corresponding to 198 cases and an independent test set of 112 images corresponding to 56 cases; in total, the data included 101 cases with ASDs and 153 cases with normal hearts.
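For context, the depthwise separable convolution named here factorizes a standard convolution into a per-channel (depthwise) convolution followed by a 1×1 pointwise convolution; a minimal PyTorch sketch (channel sizes and layer ordering assumed) is:

```python
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    """Depthwise (per-channel) 3x3 convolution followed by a 1x1 pointwise
    convolution -- the basic building block referred to in the Methods."""
    def __init__(self, in_ch: int, out_ch: int, stride: int = 1):
        super().__init__()
        self.depthwise = nn.Conv2d(in_ch, in_ch, kernel_size=3, stride=stride,
                                   padding=1, groups=in_ch, bias=False)
        self.pointwise = nn.Conv2d(in_ch, out_ch, kernel_size=1, bias=False)
        self.bn = nn.BatchNorm2d(out_ch)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.act(self.bn(self.pointwise(self.depthwise(x))))
```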
Results
The average area under the receiver operating characteristic curve, recall, precision, specificity, F1-score, and accuracy of the proposed ASD detection model were 91.99%, 80.00%, 82.22%, 87.50%, 79.57%, and 83.04%, respectively.
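For reference, all of the reported threshold-based metrics follow directly from the binary confusion matrix; the counts in the example below are illustrative only and are not the study's data.

```python
def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Recall (sensitivity), precision, specificity, F1-score, and accuracy
    computed from confusion-matrix counts."""
    recall = tp / (tp + fn)
    precision = tp / (tp + fp)
    specificity = tn / (tn + fp)
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return dict(recall=recall, precision=precision,
                specificity=specificity, f1=f1, accuracy=accuracy)

# Illustrative counts only -- not the numbers from this study.
print(binary_metrics(tp=40, fp=9, tn=53, fn=10))
```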
Conclusions
The proposed model can accurately and automatically identify ASD, providing a strong foundation for the intelligent diagnosis of congenital heart diseases.
{"title":"Intelligent diagnosis of atrial septal defect in children using echocardiography with deep learning","authors":"Yiman LIU , Size HOU , Xiaoxiang HAN , Tongtong LIANG , Menghan HU , Xin WANG , Wei GU , Yuqi ZHANG , Qingli LI , Jiangang CHEN","doi":"10.1016/j.vrih.2023.05.002","DOIUrl":"https://doi.org/10.1016/j.vrih.2023.05.002","url":null,"abstract":"<div><h3>Background</h3><p>Atrial septal defect (ASD) is one of the most common congenital heart diseases. The diagnosis of ASD via transthoracic echocardiography is subjective and time-consuming.</p></div><div><h3>Methods</h3><p>The objective of this study was to evaluate the feasibility and accuracy of automatic detection of ASD in children based on color Doppler echocardiographic static images using end-to-end convolutional neural networks. The proposed depthwise separable convolution model identifies ASDs with static color Doppler images in a standard view. Among the standard views, we selected two echocardiographic views, i.e., the subcostal sagittal view of the atrium septum and the low parasternal four-chamber view. The developed ASD detection system was validated using a training set consisting of 396 echocardiographic images corresponding to 198 cases. Additionally, an independent test dataset of 112 images corresponding to 56 cases was used, including 101 cases with ASDs and 153 cases with normal hearts.</p></div><div><h3>Results</h3><p>The average area under the receiver operating characteristic curve, recall, precision, specificity, F1-score, and accuracy of the proposed ASD detection model were 91.99, 80.00, 82.22, 87.50, 79.57, and 83.04, respectively.</p></div><div><h3>Conclusions</h3><p>The proposed model can accurately and automatically identify ASD, providing a strong foundation for the intelligent diagnosis of congenital heart diseases.</p></div>","PeriodicalId":33538,"journal":{"name":"Virtual Reality Intelligent Hardware","volume":"6 3","pages":"Pages 217-225"},"PeriodicalIF":0.0,"publicationDate":"2024-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2096579623000244/pdfft?md5=3ade0d91e713f6555fd1c75181120add&pid=1-s2.0-S2096579623000244-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141484840","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Towards engineering a portable platform for laparoscopic pre-training in virtual reality with haptic feedback
Pub Date: 2024-04-01 | DOI: 10.1016/j.vrih.2023.10.007 | Virtual Reality & Intelligent Hardware 6(2): 83-99
Hans-Georg Enkler, Wolfgang Kunert, Stefan Pfeffer, Kai-Jonas Bock, Steffen Axt, Jonas Johannink, Christoph Reich
Background
Laparoscopic surgery is a surgical technique in which specialized instruments are inserted into the body through small incisions. For some time, efforts have been made to improve surgical pre-training through practical exercises on abstracted and simplified models.
Methods
The authors aim for a portable, easy-to-use, and cost-effective virtual reality (VR)-based laparoscopic pre-training platform and therefore address the question of how such a system must be designed to match the quality of today's gold standard, which uses real tissue specimens. Current VR controllers offer limited haptic feedback. Because haptic feedback is necessary, or at least beneficial, for laparoscopic surgery training, the platform to be developed consists of a newly designed prototype laparoscopic VR controller with haptic feedback, a commercially available head-mounted display, a VR environment for simulating laparoscopic surgery, and a training concept.
Results
To take full advantage of benefits such as the repeatability and cost-effectiveness of VR-based training, the system shall not require a tissue sample for haptic feedback. Feedback is currently calculated and displayed visually to the user in the VR environment. On the prototype controller, a first axis was equipped with perceptible feedback for test purposes. Two prototype VR controllers can be combined to simulate a typical two-handed use case, e.g., laparoscopic suturing. A Unity-based VR prototype allows the execution of simple standard pre-training exercises.
Conclusions
The first prototype enables full operation of a virtual laparoscopic instrument in VR. In addition, the simulation can compute simple interaction forces. Major challenges lie in realistic real-time tissue simulation and in calculating the forces for haptic feedback. Mechanical weaknesses were identified in the first hardware prototype and will be improved in subsequent versions. All degrees of freedom of the controller are to be provided with haptic feedback. To make forces tangible in the simulation, characteristic values need to be determined using real tissue samples. The system has yet to be validated by cross-comparing real and VR haptics with surgeons.
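As context for the simple interaction forces mentioned above, a common minimal model for haptic rendering is a penalty-based spring-damper contact force; the sketch below uses placeholder stiffness and damping values, not parameters from this work.

```python
def contact_force(penetration_depth: float, penetration_velocity: float,
                  k: float = 300.0, d: float = 2.5) -> float:
    """Penalty-based spring-damper force for haptic rendering:
    F = k * depth + d * velocity, and zero when the instrument is not in contact.
    Gains k and d are illustrative placeholders, not measured tissue values."""
    if penetration_depth <= 0.0:
        return 0.0
    return k * penetration_depth + d * penetration_velocity

# Example: instrument tip 2 mm inside the virtual tissue, closing at 10 mm/s.
print(contact_force(0.002, 0.01))  # force in newtons for these placeholder gains
```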
{"title":"Towards engineering a portable platform for laparoscopic pre-training in virtual reality with haptic feedback","authors":"Hans-Georg Enkler , Wolfgang Kunert , Stefan Pfeffer , Kai-Jonas Bock , Steffen Axt , Jonas Johannink , Christoph Reich","doi":"10.1016/j.vrih.2023.10.007","DOIUrl":"https://doi.org/10.1016/j.vrih.2023.10.007","url":null,"abstract":"<div><h3>Background</h3><p>Laparoscopic surgery is a surgical technique in which special instruments are inserted through small incision holes inside the body. For some time, efforts have been made to improve surgical pre-training through practical exercises on abstracted and reduced models.</p></div><div><h3>Methods</h3><p>The authors strive for a portable, easy to use and cost-effective Virtual Reality-based (VR) laparoscopic pre-training platform and therefore address the question of how such a system has to be designed to achieve the quality of today's gold standard using real tissue specimens. Current VR controllers are limited regarding haptic feedback. Since haptic feedback is necessary or at least beneficial for laparoscopic surgery training, the platform to be developed consists of a newly designed prototype laparoscopic VR controller with haptic feedback, a commercially available head-mounted display, a VR environment for simulating a laparoscopic surgery, and a training concept.</p></div><div><h3>Results</h3><p>To take full advantage of benefits such as repeatability and cost-effectiveness of VR-based training, the system shall not require a tissue sample for haptic feedback. It is currently calculated and visually displayed to the user in the VR environment. On the prototype controller, a first axis was provided with perceptible feedback for test purposes. Two of the prototype VR controllers can be combined to simulate a typical both-handed use case, e.g., laparoscopic suturing. A Unity-based VR prototype allows the execution of simple standard pre-trainings.</p></div><div><h3>Conclusions</h3><p>The first prototype enables full operation of a virtual laparoscopic instrument in VR. In addition, the simulation can compute simple interaction forces. Major challenges lie in a realistic real-time tissue simulation and calculation of forces for the haptic feedback. Mechanical weaknesses were identified in the first hardware prototype, which will be improved in subsequent versions. All degrees of freedom of the controller are to be provided with haptic feedback. To make forces tangible in the simulation, characteristic values need to be determined using real tissue samples. The system has yet to be validated by cross-comparing real and VR haptics with surgeons.</p></div>","PeriodicalId":33538,"journal":{"name":"Virtual Reality Intelligent Hardware","volume":"6 2","pages":"Pages 83-99"},"PeriodicalIF":0.0,"publicationDate":"2024-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S209657962300075X/pdf?md5=d39d1a5a15a4f73d021bdb17019133aa&pid=1-s2.0-S209657962300075X-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140880249","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Effects of virtual agents on interaction efficiency and environmental immersion in MR environments
Pub Date: 2024-04-01 | DOI: 10.1016/j.vrih.2023.11.001 | Virtual Reality & Intelligent Hardware 6(2): 169-179
Yihua Bao, Jie Guo, Dongdong Weng, Yue Liu, Zeyu Tian
Background
Physical entity interactions in mixed reality (MR) environments aim to harness human capabilities in manipulating physical objects, thereby enhancing virtual environment (VE) functionality. In MR, a common strategy is to use virtual agents as substitutes for physical entities, balancing interaction efficiency with environmental immersion. However, the impact of virtual agent size and form on interaction performance remains unclear.
Methods
Two experiments were conducted to explore how virtual agent size and form affect interaction performance, immersion, and preference in MR environments. The first experiment assessed five virtual agent sizes (25%, 50%, 75%, 100%, and 125% of physical size). The second experiment tested four types of frames (no frame, consistent frame, half frame, and surrounding frame) across all agent sizes. Participants, utilizing a head-mounted display, performed tasks involving moving cups, typing words, and using a mouse. They completed questionnaires assessing aspects such as the virtual environment effects, interaction effects, collision concerns, and preferences.
Results
Results from the first experiment revealed that agents matching physical object size produced the best overall performance. The second experiment demonstrated that consistent framing notably enhances interaction accuracy and speed but reduces immersion. To balance efficiency and immersion, frameless agents matching physical object sizes were deemed optimal.
Conclusions
Virtual agents matching physical entity sizes enhance user experience and interaction performance. Conversely, familiar frames from 2D interfaces detrimentally affect interaction and immersion in virtual spaces. This study provides valuable insights for the future development of MR systems.
{"title":"Effects of virtual agents on interaction efficiency and environmental immersion in MR environments","authors":"Yihua Bao , Jie Guo , Dongdong Weng , Yue Liu , Zeyu Tian","doi":"10.1016/j.vrih.2023.11.001","DOIUrl":"https://doi.org/10.1016/j.vrih.2023.11.001","url":null,"abstract":"<div><h3>Background</h3><p>Physical entity interactions in mixed reality (MR) environments aim to harness human capabilities in manipulating physical objects, thereby enhancing virtual environment (VEs) functionality. In MR, a common strategy is to use virtual agents as substitutes for physical entities, balancing interaction efficiency with environmental immersion. However, the impact of virtual agent size and form on interaction performance remains unclear.</p></div><div><h3>Methods</h3><p>Two experiments were conducted to explore how virtual agent size and form affect interaction performance, immersion, and preference in MR environments. The first experiment assessed five virtual agent sizes (25%, 50%, 75%, 100%, and 125% of physical size). The second experiment tested four types of frames (no frame, consistent frame, half frame, and surrounding frame) across all agent sizes. Participants, utilizing a head-mounted display, performed tasks involving moving cups, typing words, and using a mouse. They completed questionnaires assessing aspects such as the virtual environment effects, interaction effects, collision concerns, and preferences.</p></div><div><h3>Results</h3><p>Results from the first experiment revealed that agents matching physical object size produced the best overall performance. The second experiment demonstrated that consistent framing notably enhances interaction accuracy and speed but reduces immersion. To balance efficiency and immersion, frameless agents matching physical object sizes were deemed optimal.</p></div><div><h3>Conclusions</h3><p>Virtual agents matching physical entity sizes enhance user experience and interaction performance. Conversely, familiar frames from 2D interfaces detrimentally affect interaction and immersion in virtual spaces. This study provides valuable insights for the future development of MR systems.</p></div>","PeriodicalId":33538,"journal":{"name":"Virtual Reality Intelligent Hardware","volume":"6 2","pages":"Pages 169-179"},"PeriodicalIF":0.0,"publicationDate":"2024-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2096579623000761/pdf?md5=79a7ef4bebb12cdd0b6fb18240dafefc&pid=1-s2.0-S2096579623000761-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140880274","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
VR-based digital twin for remote monitoring of mining equipment: Architecture and a case study
Pub Date: 2024-04-01 | DOI: 10.1016/j.vrih.2023.12.002 | Virtual Reality & Intelligent Hardware 6(2): 100-112
Jovana Plavšić, Ilija Mišković
Background
Traditional methods for monitoring mining equipment rely primarily on visual inspections, which are time-consuming, inefficient, and hazardous. This article introduces a novel approach to monitoring mission-critical systems and services in the mining industry by integrating virtual reality (VR) and digital twin (DT) technologies. VR-based DTs enable remote equipment monitoring, advanced analysis of machine health, enhanced visualization, and improved decision making.
Methods
This article presents an architecture for VR-based DT development, including the developmental stages, activities, and stakeholders involved. A case study on the condition monitoring of a conveyor belt using real-time synthetic vibration sensor data was conducted using the proposed methodology. The study demonstrated the application of the methodology in remote monitoring and identified the need for further development for implementation in active mining operations. The article also discusses interdisciplinarity, choice of tools, computational resources, time and cost, human involvement, user acceptance, frequency of inspection, multiuser environment, potential risks, and applications beyond the mining industry.
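As a purely illustrative sketch of how real-time synthetic vibration data for such a conveyor-belt digital twin could be generated, the following produces a baseline pulley harmonic with noise and an optional fault component; all frequencies and amplitudes are invented placeholders, not values from the case study.

```python
import numpy as np

def synth_vibration(duration_s: float, fs: int = 1000, pulley_hz: float = 12.0,
                    fault: bool = False,
                    rng=np.random.default_rng(0)) -> np.ndarray:
    """Synthetic accelerometer trace: a baseline pulley harmonic plus noise,
    with an extra high-frequency component when a bearing fault is simulated.
    All parameter values are illustrative placeholders."""
    t = np.arange(0, duration_s, 1.0 / fs)
    signal = 0.5 * np.sin(2 * np.pi * pulley_hz * t) + 0.05 * rng.standard_normal(t.size)
    if fault:
        signal += 0.3 * np.sin(2 * np.pi * 7 * pulley_hz * t)  # added fault harmonic
    return signal

# One-second healthy vs. faulty windows that a VR digital twin could visualize.
healthy, faulty = synth_vibration(1.0), synth_vibration(1.0, fault=True)
print(healthy.shape, float(np.std(healthy)), float(np.std(faulty)))
```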
Results
The findings of this study provide a foundation for future research in the domain of VR-based DTs for remote equipment monitoring and a novel application area for VR in mining.
{"title":"VR-based digital twin for remote monitoring of mining equipment: Architecture and a case study","authors":"Jovana Plavšić, Ilija Mišković","doi":"10.1016/j.vrih.2023.12.002","DOIUrl":"https://doi.org/10.1016/j.vrih.2023.12.002","url":null,"abstract":"<div><h3>Background</h3><p>Traditional methods for monitoring mining equipment rely primarily on visual inspections, which are time-consuming, inefficient, and hazardous. This article introduces a novel approach to monitoring mission-critical systems and services in the mining industry by integrating virtual reality (VR) and digital twin (DT) technologies. VR-based DTs enable remote equipment monitoring, advanced analysis of machine health, enhanced visualization, and improved decision making.</p></div><div><h3>Methods</h3><p>This article presents an architecture for VR-based DT development, including the developmental stages, activities, and stakeholders involved. A case study on the condition monitoring of a conveyor belt using real-time synthetic vibration sensor data was conducted using the proposed methodology. The study demonstrated the application of the methodology in remote monitoring and identified the need for further development for implementation in active mining operations. The article also discusses interdisciplinarity, choice of tools, computational resources, time and cost, human involvement, user acceptance, frequency of inspection, multiuser environment, potential risks, and applications beyond the mining industry.</p></div><div><h3>Results</h3><p>The findings of this study provide a foundation for future research in the domain of VR-based DTs for remote equipment monitoring and a novel application area for VR in mining.</p></div>","PeriodicalId":33538,"journal":{"name":"Virtual Reality Intelligent Hardware","volume":"6 2","pages":"Pages 100-112"},"PeriodicalIF":0.0,"publicationDate":"2024-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2096579623000852/pdf?md5=fc1470df3595a2597f7acf4dc88f0ea0&pid=1-s2.0-S2096579623000852-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140880270","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Exploring the effect of fingertip aero-haptic feedforward cues in directing eyes-free target acquisition in VR
Pub Date: 2024-04-01 | DOI: 10.1016/j.vrih.2023.12.001 | Virtual Reality & Intelligent Hardware 6(2): 113-131
Xiaofei Ren, Jian He, Teng Han, Songxian Liu, Mengfei Lv, Rui Zhou
Background
The sense of touch plays a crucial role in interactive behavior within virtual spaces, particularly when visual attention is absent. Although haptic feedback has been widely used to compensate for the lack of visual cues, the use of tactile information as a predictive feedforward cue to guide hand movements remains unexplored and lacks theoretical understanding.
Methods
This study introduces a fingertip aero-haptic rendering method to investigate its effectiveness in directing hand movements during eyes-free spatial interactions. The wearable device incorporates a multichannel micro-airflow chamber to deliver adjustable tactile effects on the fingertips.
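The mapping from a desired cue direction to per-channel airflow is not specified in the abstract; a hypothetical sketch with an evenly spaced channel layout and cosine weighting (both assumptions) could look like this:

```python
import numpy as np

def airflow_intensities(target_angle_rad: float, n_channels: int = 4,
                        max_intensity: float = 1.0) -> np.ndarray:
    """Map a desired cue direction to intensities for airflow channels laid out
    evenly around the fingertip: channels aligned with the target blow hardest.
    The channel count and cosine weighting are illustrative assumptions."""
    channel_angles = np.arange(n_channels) * 2 * np.pi / n_channels
    weights = np.cos(channel_angles - target_angle_rad)
    return max_intensity * np.clip(weights, 0.0, None)  # only 'toward' channels active

# Cue pointing 45 degrees to the upper-right with a 4-channel chamber.
print(np.round(airflow_intensities(np.pi / 4), 3))
```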
Results
The first study verified that tactile directional feedforward cues significantly improve user capabilities in eyes-free target acquisition and that users rely heavily on haptic indications rather than spatial memory to control their hands. A subsequent study examined the impact of enriched tactile feedforward cues on assisting users in determining precise target positions during eyes-free interactions, and assessed the required learning efforts.
Conclusions
The haptic feedforward effect holds great practical promise for eyes-free design in virtual reality. In the future, we aim to integrate cognitive models with tactile feedforward cues and to apply richer tactile feedforward information to alleviate users' perceptual deficiencies.
{"title":"Exploring the effect of fingertip aero-haptic feedforward cues in directing eyes-free target acquisition in VR","authors":"Xiaofei Ren , Jian He , Teng Han , Songxian Liu , Mengfei Lv , Rui Zhou","doi":"10.1016/j.vrih.2023.12.001","DOIUrl":"https://doi.org/10.1016/j.vrih.2023.12.001","url":null,"abstract":"<div><h3>Background</h3><p>The sense of touch plays a crucial role in interactive behavior within virtual spaces, particularly when visual attention is absent. Although haptic feedback has been widely used to compensate for the lack of visual cues, the use of tactile information as a predictive feedforward cue to guide hand movements remains unexplored and lacks theoretical understanding.</p></div><div><h3>Methods</h3><p>This study introduces a fingertip aero-haptic rendering method to investigate its effectiveness in directing hand movements during eyes-free spatial interactions. The wearable device incorporates a multichannel micro-airflow chamber to deliver adjustable tactile effects on the fingertips.</p></div><div><h3>Results</h3><p>The first study verified that tactile directional feedforward cues significantly improve user capabilities in eyes-free target acquisition and that users rely heavily on haptic indications rather than spatial memory to control their hands. A subsequent study examined the impact of enriched tactile feedforward cues on assisting users in determining precise target positions during eyes-free interactions, and assessed the required learning efforts.</p></div><div><h3>Conclusions</h3><p>The haptic feedforward effect holds great practical promise in eyeless design for virtual reality. We aim to integrate cognitive models and tactile feedforward cues in the future, and apply richer tactile feedforward information to alleviate users' perceptual deficiencies.</p></div>","PeriodicalId":33538,"journal":{"name":"Virtual Reality Intelligent Hardware","volume":"6 2","pages":"Pages 113-131"},"PeriodicalIF":0.0,"publicationDate":"2024-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2096579623000839/pdf?md5=d8fff3e7495bcc4ee949335d5463ff3c&pid=1-s2.0-S2096579623000839-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140880271","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}