Beyond the Mean: Statistical Measures for Quantifying Perceptual Quality in Neural Rendering
Pub Date: 2025-11-07 | DOI: 10.1109/OJID.2025.3630586
Shihao Luo;Truong Cong Thang
Novel View Synthesis (NVS) aims to generate unseen views of a scene from limited observations and has become foundational to immersive multimedia through recent advances in neural rendering. In immersive displays, the success of an NVS system depends mainly on the overall perceptual quality it can provide. However, most neural rendering models report only the mean (average) quality value over the synthesized novel views as the indicator of overall performance. This practice lacks validation, raising the question of whether the mean alone is sufficient to capture overall perceptual quality. In this work, we present a statistical examination of whether overall perceptual quality in neural rendering is adequately represented by the mean or requires additional measures. Our results highlight the dominant role of mean quality while introducing a more comprehensive statistical framework for overall quality assessment in neural rendering. We demonstrate that other statistical measures capture aspects of perceptual quality that the mean alone cannot fully represent.
{"title":"Beyond the Mean: Statistical Measures for Quantifying Perceptual Quality in Neural Rendering","authors":"Shihao Luo;Truong Cong Thang","doi":"10.1109/OJID.2025.3630586","DOIUrl":"https://doi.org/10.1109/OJID.2025.3630586","url":null,"abstract":"Novel View Synthesis (NVS) aims to generate unseen views of a scene from limited observations and has become foundational to immersive multimedia through recent advances in neural rendering. In immersive displays, the success of an NVS system mainly depends on the overall perceptual quality it can provide. However, most neural rendering models report only the mean (the average) values of synthesized novel views as the overall quality performance indicator. This practice lacks validation, raising the question of whether relying solely on the mean is sufficient to capture overall perceptual quality. In this work, we present a statistical examination to investigate whether overall perceptual quality in neural rendering is sufficiently represented by the mean alone or requires additional measures. Our results highlight the dominant role of mean quality while introducing a more comprehensive statistical framework for overall quality assessment in neural rendering. We demonstrate that other statistical measures offer additional aspects of the perceptual quality that the mean alone cannot fully represent.","PeriodicalId":100634,"journal":{"name":"IEEE Open Journal on Immersive Displays","volume":"2 ","pages":"200-209"},"PeriodicalIF":0.0,"publicationDate":"2025-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11235564","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145674681","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Maxwellian-View Augmented Reality Displays With Extended Depth of Field
Pub Date: 2025-10-15 | DOI: 10.1109/OJID.2025.3622128
Sung-Min Jung;Laurynas Valantinas;Leon Preston;Pawan K. Shrestha
Driven by the growing demand for enhanced visual interaction with the physical world, augmented reality (AR) display technologies have rapidly emerged as transformative interfaces, bridging digital information seamlessly with real-world environments. Conventional optical architectures—such as birdbath, freeform combiners, holographic optical elements (HOEs), and waveguide couplers—offer trade-offs in resolution, brightness, transparency, and field of view (FOV) but often suffer from accommodation mismatch and vergence–accommodation conflict (VAC), causing visual discomfort. Maxwellian-view or retina projection displays address these issues by projecting focused light bundles directly into the pupil, producing depth-invariant images that remain sharp regardless of the eye’s accommodative state. This principle provides extended depth of field (DOF) and enables compact optical designs, though challenges such as a narrow eye-box remain. Recent advances—such as exit pupil expansion via multiple replications of converging points using HOE optics, beam splitter arrays, integration of holographic elements, and selective LED illumination—are enhancing the practicality of Maxwellian systems. By combining these with adaptive optics, intelligent tracking, and hybrid focal strategies, next-generation AR displays could deliver superior comfort, immersion, and usability. This review outlines the principles, advantages, limitations, and recent developments of Maxwellian AR optics, highlighting their potential as a transformative foundation for future AR display technologies across medical, industrial, and consumer applications.
{"title":"Maxwellian-View Augmented Reality Displays With Extended Depth of Field","authors":"Sung-Min Jung;Laurynas Valantinas;Leon Preston;Pawan K. Shrestha","doi":"10.1109/OJID.2025.3622128","DOIUrl":"https://doi.org/10.1109/OJID.2025.3622128","url":null,"abstract":"Driven by the growing demand for enhanced visual interaction with the physical world, augmented reality (AR) display technologies have rapidly emerged as transformative interfaces, bridging digital information seamlessly with real-world environments. Conventional optical architectures—such as birdbath, freeform combiners, holographic optical elements (HOEs), and waveguide couplers—offer trade-offs in resolution, brightness, transparency, and field of view (FOV) but often suffer from accommodation mismatch and vergence–accommodation conflict (VAC), causing visual discomfort. Maxwellian-view or retina projection displays address these issues by projecting focused light bundles directly into the pupil, producing depth-invariant images that remain sharp regardless of the eye’s accommodative state. This principle provides extended depth of field (DOF) and enables compact optical designs, though challenges such as a narrow eye-box remain. Recent advances—such as exit pupil expansion via multiple replications of converging points using HOE optics, beam splitter arrays, integration of holographic elements, and selective LED illumination—are enhancing the practicality of Maxwellian systems. By combining these with adaptive optics, intelligent tracking, and hybrid focal strategies, next-generation AR displays could deliver superior comfort, immersion, and usability. This review outlines the principles, advantages, limitations, and recent developments of Maxwellian AR optics, highlighting their potential as a transformative foundation for future AR display technologies across medical, industrial, and consumer applications.","PeriodicalId":100634,"journal":{"name":"IEEE Open Journal on Immersive Displays","volume":"2 ","pages":"114-121"},"PeriodicalIF":0.0,"publicationDate":"2025-10-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11204834","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145405440","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
From Touch to Immersion: A Systematic Review of Haptic and Multimodal Perception in the Metaverse
Pub Date: 2025-10-15 | DOI: 10.1109/OJID.2025.3622139
Wenyu Yang;Zihe Zhao;Shuo Gao
In the emerging metaverse, immersive experiences depend on the integration of multimodal sensory channels, with haptic feedback playing a central role in enhancing realism and interaction. Recent progress in wearable devices, force feedback, and neural interfaces has driven research on combining tactile sensations with visual, auditory, olfactory, thermal, and force-related signals. Yet, current systems still face challenges in latency, energy efficiency, synchronisation, and personalisation, limiting their ability to match the complexity of human perception. This paper reviews the trajectory of haptic and multimodal perception fusion in the metaverse. It first introduces the biological and psychological foundations of touch, then discusses tactile technologies and device configurations. Next, it examines multimodal fusion models and mechanisms through comparative analyses and application evaluations, focusing on spatio-temporal synchronisation, cross-modal compensation, perceptual enhancement, and attentional allocation. The review provides an overview of technical approaches, implementation strategies, and application challenges, offering both theoretical grounding and practical insights for designing synchronised, real-time, personalised, and scalable multisensory interaction systems.
{"title":"From Touch to Immersion: A Systematic Review of Haptic and Multimodal Perception in the Metaverse","authors":"Wenyu Yang;Zihe Zhao;Shuo Gao","doi":"10.1109/OJID.2025.3622139","DOIUrl":"https://doi.org/10.1109/OJID.2025.3622139","url":null,"abstract":"In the emerging metaverse, immersive experiences depend on the integration of multimodal sensory channels, with haptic feedback playing a central role in enhancing realism and interaction. Recent progress in wearable devices, force feedback, and neural interfaces has driven research on combining tactile sensations with visual, auditory, olfactory, thermal, and force-related signals. Yet, current systems still face challenges in latency, energy efficiency, synchronisation, and personalisation, limiting their ability to match the complexity of human perception. This paper reviews the trajectory of haptic and multimodal perception fusion in the metaverse. It first introduces the biological and psychological foundations of touch, then discusses tactile technologies and device configurations. Next, it examines multimodal fusion models and mechanisms through comparative analyses and application evaluations, focusing on spatio-temporal synchronisation, cross-modal compensation, perceptual enhancement, and attentional allocation. The review provides an overview of technical approaches, implementation strategies, and application challenges, offering both theoretical grounding and practical insights for designing synchronised, real-time, personalised, and scalable multisensory interaction systems.","PeriodicalId":100634,"journal":{"name":"IEEE Open Journal on Immersive Displays","volume":"2 ","pages":"122-138"},"PeriodicalIF":0.0,"publicationDate":"2025-10-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11204837","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145455987","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Carbon Nanotube-Based Flexible and Stretchable Electronics
Pub Date: 2025-10-14 | DOI: 10.1109/OJID.2025.3620828
Dexing Liu;Rui Qiu;Jingwen Liu;Can Li;Min Zhang
Flexible and stretchable electronics hold great promise for wearable devices and health monitoring systems owing to their mechanical conformability, light weight, and seamless integration capabilities. Nevertheless, achieving high electrical performance while retaining flexibility and operational stability remains a considerable challenge. Among various candidate materials, carbon nanotubes (CNTs) have garnered significant attention due to their exceptional carrier mobility, inherent mechanical flexibility, and compatibility with low-temperature fabrication processes. This article provides a comprehensive overview of recent progress in flexible and stretchable electronics, focusing specifically on advances in CNT-based thin-film transistors and integrated circuits. Furthermore, it examines emerging applications of CNTs in artificial neuromorphic systems and soft sensors, underscoring their potential to enable next-generation soft electronic technologies.
{"title":"Carbon Nanotube-Based Flexible and Stretchable Electronics","authors":"Dexing Liu;Rui Qiu;Jingwen Liu;Can Li;Min Zhang","doi":"10.1109/OJID.2025.3620828","DOIUrl":"https://doi.org/10.1109/OJID.2025.3620828","url":null,"abstract":"Flexible and stretchable electronics hold great promise for wearable devices and health monitoring systems owing to their mechanical conformability, light weight, and seamless integration capabilities. Nevertheless, achieving high electrical performance while retaining flexibility and operational stability remains a considerable challenge. Among various candidate materials, carbon nanotubes (CNTs) have garnered significant attention due to their exceptional carrier mobility, inherent mechanical flexibility, and compatibility with low-temperature fabrication processes. This article provides a comprehensive overview of recent progress in flexible and stretchable electronics, focusing specifically on advances in CNT-based thin-film transistors and integrated circuits. Furthermore, it examines emerging applications of CNTs in artificial neuromorphic systems and soft sensors, underscoring their potential to enable next-generation soft electronic technologies.","PeriodicalId":100634,"journal":{"name":"IEEE Open Journal on Immersive Displays","volume":"2 ","pages":"139-165"},"PeriodicalIF":0.0,"publicationDate":"2025-10-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11202373","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145510232","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Eye Pose Estimation and Tracking Using Iris as a Base Feature
Pub Date: 2025-10-01 | DOI: 10.1109/OJID.2025.3616949
Dmitry Shmunk
A novel, fast, and robust method for 3D eye pose tracking is proposed that leverages the anatomical constancy of the human iris to improve accuracy and computational efficiency. Traditional pupil-based methods suffer from pupil size variability, decentering, and the need for complex corrections for refraction through the corneal bulge. In contrast, the iris, owing to its fixed size and direct visibility, serves as a more reliable feature for precise eye pose estimation. Our method combines key advantages of both model-based and regression-based approaches without requiring external glint-producing light sources or the high computational overhead of neural-network-based solutions. The iris is used as the primary tracking feature, enabling robust detection even under partial occlusion and for users wearing prescription eyewear. Exploiting the consistent geometry of the iris, we estimate gaze direction and 3D eye position with high precision. Unlike existing methods, the proposed approach minimizes reliance on pupil measurements, employing the pupil's high contrast only to augment iris detection. This strategy ensures robustness in real-world scenarios, including varying illumination and the stray light, glints, and distortions introduced by corrective eyewear. Experimental results show that the method achieves low computational cost while maintaining state-of-the-art performance.
{"title":"Eye Pose Estimation and Tracking Using Iris as a Base Feature","authors":"Dmitry Shmunk","doi":"10.1109/OJID.2025.3616949","DOIUrl":"https://doi.org/10.1109/OJID.2025.3616949","url":null,"abstract":"A novel, fast, and robust method for 3D eye pose tracking that leverages the anatomical constancy of the human iris to improve accuracy and computational efficiency is proposed. Traditional pupil-based methods suffer from limitations due to pupil size variability, decentering, and the need for complex corrections for refraction through the corneal bulge. In contrast, the iris, due to its fixed size and direct visibility, serves as a more reliable feature for precise eye pose estimation. Our method combines key advantages of both model-based and regression-based approaches without requiring external glint-producing light sources or high computational overheads associated with neural-network-based solutions. The iris is used as the primary tracking feature, enabling robust detection even under partial occlusion and in users wearing prescription eyewear. Exploiting the consistent geometry of the iris, we estimate gaze direction and 3D eye position with high precision. Unlike existing methods, the proposed approach minimizes reliance on pupil measurements, employing the pupil’s high contrast only to augment iris detection. This strategy ensures robustness in real-world scenarios, including varying illumination and stray light/glints/distortions introduced by corrective eyewear. Experimental results show that the method achieves low computational cost while maintaining state-of-the-art performance.","PeriodicalId":100634,"journal":{"name":"IEEE Open Journal on Immersive Displays","volume":"2 ","pages":"96-105"},"PeriodicalIF":0.0,"publicationDate":"2025-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11189046","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145352130","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Comfort-Aware Trajectory Optimization for Immersive Human-Robot Interaction
Pub Date: 2025-09-25 | DOI: 10.1109/OJID.2025.3614514
Yitian Kou;Dandan Zhu;Hao Zeng;Kaiwei Zhang;Xiaoxiao Sui;Xiongkuo Min;Guangtao Zhai
In human-robot cohabited environments, generating socially acceptable and human-like trajectories is critical to fostering safe, comfortable, and intuitive interactions. This paper presents a trajectory prediction framework that emulates human walking behavior by incorporating social dynamics and comfort-driven optimization, specifically within immersive virtual environments. Leveraging the Social Locomotion Model (SLM), our framework captures interpersonal interactions and spatial preferences, modeling how humans implicitly adjust their paths to maintain social norms. We further introduce a Nelder-Mead-based optimization process to refine robot trajectories under these constraints, ensuring both goal-directedness and human-likeness while remaining efficient and practical. To evaluate the perceptual realism and spatial comfort of the generated trajectories, we conduct a user study in a virtual reality (VR) setting, where participants experience and assess various robot navigation behaviors from a first-person perspective. Subjective feedback indicates that the trajectories optimized by our model are perceived as significantly more natural and comfortable than those generated by baseline approaches. Our framework demonstrates strong potential for deployment in virtual human-robot interaction systems, where social legibility, responsiveness, and computational efficiency are all critical.
{"title":"Comfort-Aware Trajectory Optimization for Immersive Human-Robot Interaction","authors":"Yitian Kou;Dandan Zhu;Hao Zeng;Kaiwei Zhang;Xiaoxiao Sui;Xiongkuo Min;Guangtao Zhai","doi":"10.1109/OJID.2025.3614514","DOIUrl":"https://doi.org/10.1109/OJID.2025.3614514","url":null,"abstract":"In human-robot cohabited environments, generating socially acceptable and human-like trajectories is critical to fostering safe, comfortable, and intuitive interactions. This paper presents a trajectory prediction framework that emulates human walking behavior by incorporating social dynamics and comfort-driven optimization, specifically within immersive virtual environments. Leveraging the Social Locomotion Model (SLM), our framework captures inter-personal interactions and spatial preferences, modeling how humans implicitly adjust paths to maintain social norms. We further introduce a Nelder-Mead-based optimization process to refine robot trajectories under these constraints, ensuring both goal-directedness and human-likeness with efficiency and applicability. To evaluate the perceptual realism and spatial comfort of the generated trajectories, we conduct a user study in a virtual reality (VR) setting, where participants experience and assess various robot navigation behaviors from a first-person perspective. Subjective feedback indicates that the trajectories optimized by our model are perceived to be significantly more natural and comfortable than those generated by baseline approaches. Our framework demonstrates strong potential for deployment in virtual human-robot interaction systems, where social legibility, responsiveness, and computational efficiency are all critical.","PeriodicalId":100634,"journal":{"name":"IEEE Open Journal on Immersive Displays","volume":"2 ","pages":"106-113"},"PeriodicalIF":0.0,"publicationDate":"2025-09-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11180046","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145352131","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
AI and Physics-Based Computational Methods for Oxide Thin-Film Transistors: A Review
Pub Date: 2025-09-16 | DOI: 10.1109/OJID.2025.3610414
Eunkyung Koh;Hyeon-Deuk Kim;Rokyeon Kim;Byoungtaek Son;Sang-Hoon Lee;Gyehyun Park;Eui-Cheol Shin;Yongsoo Lee;Insoo Wang
Oxide thin-film transistors (TFTs) are critical components in modern display technologies due to their high mobility, optical transparency, and low-temperature processability. As the design space expands across material systems, device architectures, and operating conditions, there is a growing demand for computational methods that support reliable and efficient modeling. This review presents a comprehensive overview of AI- and physics-based methods for oxide TFTs, spanning from atomistic material analysis to circuit-level modeling. We discuss atomistic simulations such as density functional theory (DFT) and molecular dynamics (MD) for defect energetics and carrier behavior, technology computer-aided design (TCAD) for device-level electrothermal analysis, and compact models for circuit simulation. The role of artificial intelligence in surrogate modeling, parameter extraction, and the optimization of materials, device structures, and processes is discussed. By bridging simulation methods across multiple scales, this review provides insights into accelerating the design and analysis of oxide TFTs.
{"title":"AI and Physics-Based Computational Methods for Oxide Thin-Film Transistors: A Review","authors":"Eunkyung Koh;Hyeon-Deuk Kim;Rokyeon Kim;Byoungtaek Son;Sang-Hoon Lee;Gyehyun Park;Eui-Cheol Shin;Yongsoo Lee;Insoo Wang","doi":"10.1109/OJID.2025.3610414","DOIUrl":"https://doi.org/10.1109/OJID.2025.3610414","url":null,"abstract":"Oxide thin-film transistors (TFTs) are critical components in modern display technologies due to their high mobility, optical transparency, and low-temperature processability. As the design space expands across material systems, device architectures, and operating conditions, there is a growing demand for computational methods that support reliable and efficient modeling. This review presents a comprehensive overview of AI- and physics-based methods for oxide TFTs, spanning from atomistic material analysis to circuit-level modeling. We discuss atomistic simulations such as density functional theory (DFT) and molecular dynamics (MD) for defect energetics and carrier behavior, technology computer-aided design (TCAD) for device-level electrothermal analysis, and compact models for circuit simulation. The role of artificial intelligence in surrogate modeling, parameter extraction, optimization of materials, device structures, and processes is discussed. By bridging simulation methods across multiple scales, this review provides insights into accelerating the design, and analysis of oxide TFTs.","PeriodicalId":100634,"journal":{"name":"IEEE Open Journal on Immersive Displays","volume":"2 ","pages":"89-95"},"PeriodicalIF":0.0,"publicationDate":"2025-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11165197","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145141684","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
BPGI: A Brain-Perception Guided Interactive Network for Stereoscopic Omnidirectional Image Quality Assessment
Pub Date: 2025-09-16 | DOI: 10.1109/OJID.2025.3610449
Yun Liu;Sifan Li;Zihan Liu;Haiyuan Wang;Daoxin Fan
Stereoscopic omnidirectional image quality assessment combines stereoscopic image quality assessment with omnidirectional image quality assessment, making it more challenging than quality assessment of traditional three-dimensional images. Previous works fail to achieve satisfying performance because they neglect the perception mechanisms of the human brain. To address this problem, we propose an effective brain-perception guided interactive network for stereoscopic omnidirectional image quality assessment (BPGI), built around three perception steps: visual information processing, feature fusion cognition, and quality evaluation. Considering stereoscopic perception characteristics, both binocular and monocular visual features are extracted. Following the brain's complex cognition mechanisms, a Bi-LSTM module is introduced to mine the deep inherent relationship between monocular and binocular visual features and improve the feature representation ability of the proposed model. A visual feature fusion module is then built to achieve effective interactive fusion for quality prediction. Experimental results show that the proposed model outperforms many state-of-the-art models and can be effectively applied to predict the quality of stereoscopic omnidirectional images.
{"title":"BPGI: A Brain-Perception Guided Interactive Network for Stereoscopic Omnidirectional Image Quality Assessment","authors":"Yun Liu;Sifan Li;Zihan Liu;Haiyuan Wang;Daoxin Fan","doi":"10.1109/OJID.2025.3610449","DOIUrl":"https://doi.org/10.1109/OJID.2025.3610449","url":null,"abstract":"Stereoscopic omnidirectional image quality assessment is a combination task of stereoscopic image quality assessment and omnidirectional image quality assessment, which is more challenging than traditional three-dimensional images. Previous works fail to present a satisfying performance due to neglecting human brain perception mechanism. To solve the above problem, we proposed an effective brain-perception guided interactive network for stereoscopic omnidirectional image quality assessment (BPGI), which is built following three perception steps: visual information processing, feature fusion cognition, and quality evaluation. Considering the stereoscopic perception characteristics, binocular and monocular visual features are both extracted. Following human complex cognition mechanism, a Bi-LSTM module is introduced to dig the deeply inherent relationship between monocular and binocular visual feature and improve the feature representation ability of the proposed model. Then a visual feature fusion module is built to obtain effective interactive fusion for quality prediction. Experimental results prove that the proposed model outperforms many state-of-the-art models, and can be effectively applied to predict the quality of stereoscopic omnidirectional images.","PeriodicalId":100634,"journal":{"name":"IEEE Open Journal on Immersive Displays","volume":"2 ","pages":"81-88"},"PeriodicalIF":0.0,"publicationDate":"2025-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11165215","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145141682","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Immersive Touch-Interactive Display Enabled by Virtual Button
Pub Date: 2025-09-05 | DOI: 10.1109/OJID.2025.3606863
Shijie Xing;Shiqi Luo;Kai Wang
Virtual buttons on touch screens have assumed an increasingly prominent role within the field of human-computer interaction, functioning as a critical interface modality between users and digital systems. Following their emergence as a widely adopted alternative to traditional input devices such as the mouse and keyboard, virtual buttons have been extensively implemented across a broad spectrum of sectors, including manufacturing, healthcare, finance, and education. Their key advantages—including compact physical design, minimal hardware requirements, ease of system integration, and a relatively low user learning threshold—contribute to their suitability for both consumer-oriented products and domain-specific industrial applications. This paper presents a comprehensive review of numerous influential studies published over recent decades, offering a structured synthesis of the technological progression and functional evolution of virtual button systems. It provides a comparative analysis of various implementations grounded in distinct touch-sensing technologies, such as capacitive, resistive, piezoelectric, and infrared mechanisms, and evaluates their applicability in contexts including consumer electronics and in-vehicle interfaces. The discussion concludes with an examination of emerging trends and prospective development trajectories aimed at addressing the growing demand for intelligent, adaptive virtual button solutions.
{"title":"Immersive Touch-Interactive Display Enabled by Virtual Button","authors":"Shijie Xing;Shiqi Luo;Kai Wang","doi":"10.1109/OJID.2025.3606863","DOIUrl":"https://doi.org/10.1109/OJID.2025.3606863","url":null,"abstract":"Virtual buttons on touch screens have assumed an increasingly prominent role within the field of human-computer interaction, functioning as a critical interface modality between users and digital systems. Following their emergence as a widely adopted alternative to traditional input devices such as the mouse and keyboard, virtual buttons have been extensively implemented across a broad spectrum of sectors, including manufacturing, healthcare, finance, and education. Their key advantages—including compact physical design, minimal hardware requirements, ease of system integration, and a relatively low user learning threshold—contribute to their suitability for both consumer-oriented products and domain-specific industrial applications. This paper presents a comprehensive review of numerous influential studies published over recent decades, offering a structured synthesis of the technological progression and functional evolution of virtual button systems. It provides a comparative analysis of various implementations grounded in distinct touch-sensing technologies, such as capacitive, resistive, piezoelectric, and infrared mechanisms, and evaluates their applicability in contexts including consumer electronics and in-vehicle interfaces. The discussion concludes with an examination of emerging trends and prospective development trajectories aimed at addressing the growing demand for intelligent, adaptive virtual button solutions.","PeriodicalId":100634,"journal":{"name":"IEEE Open Journal on Immersive Displays","volume":"2 ","pages":"71-80"},"PeriodicalIF":0.0,"publicationDate":"2025-09-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11152562","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145141683","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
On the Utilization of SLAM for Obstacle Detection in Commodity Mobile Devices
Pub Date: 2025-07-25 | DOI: 10.1109/OJID.2025.3592064
Dionysios Koulouris;Orestis Zaras;Andreas Menychtas;Panayiotis Tsanakas;Ilias Maglogiannis
Immersive technologies are an increasingly prevalent field, employed by a plethora of portable and stationary solutions. Ongoing research continues to unlock new possibilities for their use, improving human-machine interaction. In the health sector, such technologies have the potential to improve the daily living of individuals with special needs, such as the visually impaired. SLAM is a technique used in robotics and computer vision to achieve environmental understanding by building a vector map of the frontal space. It is fundamental to developing applications that deliver immersive experiences to the user, such as augmented reality applications. The ability of these applications to understand their surroundings, together with the diverse methodologies studied over the years, has led to efficient techniques for extracting depth data from a camera feed without the need for a dedicated depth sensor. This work introduces a system that exploits the depth estimation capabilities of SLAM to detect obstacles and cliffs in the user's frontal environment. A mobile application was developed that retrieves the camera feed and generates scene depth maps, then passes them to a newly designed algorithm for obstacle and cliff identification. Audio and haptic feedback is used for warnings and usability notifications. The system was fine-tuned and tested in both indoor and outdoor spaces, and quantitative and qualitative results were captured. The goal of the study is to present a tool that runs on commodity mobile devices in real time and can enhance safety by facilitating movement in both indoor and outdoor environments.
{"title":"On the Utilization of SLAM for Obstacle Detection in Commodity Mobile Devices","authors":"Dionysios Koulouris;Orestis Zaras;Andreas Menychtas;Panayiotis Tsanakas;Ilias Maglogiannis","doi":"10.1109/OJID.2025.3592064","DOIUrl":"https://doi.org/10.1109/OJID.2025.3592064","url":null,"abstract":"Immersive Technologies are an increasingly prevalent field, employed by a plethora of portable and stationary solutions. Ongoing research continues to unlock new possibilities for their use, improving Human-Machine Interaction. In the health sector, such technologies have the potential to optimize the living of individuals with special needs, like the visually impaired. SLAM is a technique used in robotics and computer vision for achieving environmental understanding by building a vector map of the frontal space. It is the fundamental for developing applications that achieve immersive experiences to the user, such as Augmented Reality applications. The ability of these applications to understand their surroundings and the diverse methodologies studied over the years has led to the proposal of efficient techniques for extracting depth data from a camera feed, without the need of a depth sensor. This work introduces a system that exploits the Depth Estimation capabilities of SLAM to detect obstacles and cliffs in the user’s frontal environment. A mobile application was developed, that retrieves the camera feed and generates scene Depth-Maps, before importing them to a newly designed algorithm for obstacle and cliff identification. Audio and haptic feedback is used for warnings and usability notifications. The system was fine-tuned and tested in both indoor and outdoor spaces and quantitative and qualitative results were captured. The goal of the study is to present the development of a tool that can be executed on commodity mobile devices in real-time and it can enhance safety facilitating movement in both indoor and outdoor environments.","PeriodicalId":100634,"journal":{"name":"IEEE Open Journal on Immersive Displays","volume":"2 ","pages":"42-54"},"PeriodicalIF":0.0,"publicationDate":"2025-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11096572","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144868345","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}