Fatigued driving is one of the main causes of traffic accidents. To improve the detection speed of fatigue driving recognition, this paper proposes a driver fatigue detection method based on multi-parameter fusion of facial features. A cascaded AdaBoost object classifier detects faces in video streams, and the dlib library performs facial key point detection, locating the driver's eyes and mouth to determine their states. The eye aspect ratio (EAR) is calculated to detect eye closure, and the mouth aspect ratio (MAR) is calculated to detect yawning frequency and count. The percentage of eyelid closure over time (PERCLOS) is combined with yawning frequency and count in a multi-feature fusion approach for fatigue detection. Experimental results show a blink detection accuracy of 91% and a yawn detection accuracy of 96.43%. Furthermore, compared with the models in the comparative experiments, this model detects two to four times faster while maintaining accuracy.
Xuejing Du, Chengyin Yu, and Tianyi Sun, "Multi-parameter fusion driver fatigue detection method based on facial fatigue features," Journal of the Society for Information Display, vol. 32, no. 9, pp. 676–690, published 2024-07-24. DOI: 10.1002/jsid.1343.
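As an illustration of the eye-state computation the abstract describes, here is a minimal sketch of the EAR and PERCLOS calculations. The landmark ordering follows dlib's common six-point eye model and the widely used EAR formulation (EAR = (|p2−p6| + |p3−p5|) / (2·|p1−p4|)); the closed-eye threshold and function names are illustrative assumptions, not the paper's values.

```python
import numpy as np

def eye_aspect_ratio(eye):
    """EAR from six eye landmarks in dlib order; eye is a (6, 2) array."""
    eye = np.asarray(eye, dtype=float)
    v1 = np.linalg.norm(eye[1] - eye[5])  # vertical distance p2-p6
    v2 = np.linalg.norm(eye[2] - eye[4])  # vertical distance p3-p5
    h = np.linalg.norm(eye[0] - eye[3])   # horizontal distance p1-p4
    return (v1 + v2) / (2.0 * h)

def perclos(ear_series, closed_threshold=0.2):
    """Fraction of frames in which the eye is judged closed (PERCLOS).

    The 0.2 threshold is a commonly cited illustrative value, not the
    paper's calibrated one.
    """
    ear_series = np.asarray(ear_series, dtype=float)
    return float(np.mean(ear_series < closed_threshold))
```

A low EAR over many consecutive frames drives PERCLOS up, which the fusion stage can then combine with the yawning statistics.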
Mixed reality (MR) technology can be applied to simulation training, surgical performance improvement, 3D gaming, and more, and has attracted extensive attention from researchers. Users' depth perception with head-mounted MR displays such as the HoloLens is critical, especially for precision applications such as virtual hoisting. Designing and adding appropriate depth cues to MR scenes is an effective way to improve users' depth perception. In this study, taking a virtual hoisting training system as an example, a multi-cue fusion depth perception strategy is proposed to improve the perception effect. Based on the mechanism of human depth perception, five kinds of depth cues are designed, and the effect of adding each single cue is studied through a perceptual matching experiment. Based on the principle of fuzzy clustering, a multiple-cue comprehensive depth optimization strategy over the viewing distance scale is proposed. Finally, the perceptual matching results demonstrate the effectiveness of the multi-cue fusion strategy: the average error is reduced by 20.68% compared with the single-cue strategy, significantly improving spatial depth perception. This research can provide a reference for improving users' depth perception in interactive MR simulation systems.
Wei Wang, Tong Chen, Haiping Liu, Jiali Zhang, Qingli Wang, and Qinsheng Jiang, "Depth perception optimization of mixed reality simulation systems based on multiple-cue fusion," Journal of the Society for Information Display, vol. 32, no. 8, pp. 568–579, published 2024-07-21. DOI: 10.1002/jsid.1341.
In recent years, techniques for accelerating rendering by exploiting the limitations of the human visual system have become increasingly prevalent. Foveated rendering significantly reduces the computational requirements of rendering by lowering image quality in peripheral regions. In this paper, we propose a scene-content-sensitive real-time adaptive foveated rendering method. First, we pre-render the three-dimensional (3D) scene at a low resolution. Then, we use the low-resolution pre-rendered image as input to extract edge, local contrast, and color features. Subsequently, we generate a screen-space region division map based on the gaze point position. Next, we calculate the visual importance of each 16 × 16 pixel tile based on edge, local contrast, color, and screen-space region. We then map the visual importance to the shading rate to generate a shading rate control map for the current frame. Finally, we complete rendering of the current frame using variable rate shading technology. Experimental results demonstrate that our method effectively enhances the visual quality of images near the foveal region while generating high-quality foveal-region images. Furthermore, our method significantly improves performance compared with the per-pixel shading method and existing scene-content-based foveated rendering methods.
Chuanyu Shen, Chunyi Chen, and Xiaojuan Hu, "Scene-content-sensitive real-time adaptive foveated rendering," Journal of the Society for Information Display, vol. 32, no. 10, pp. 703–715, published 2024-07-14. DOI: 10.1002/jsid.1346.
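The importance-to-shading-rate step described in that abstract can be sketched as follows. The feature weights, thresholds, and rate table below are hypothetical placeholders, since the abstract does not give the paper's actual values; the shape of the pipeline (per-tile feature fusion, then a lookup into discrete shading rates) is what the sketch illustrates.

```python
# Illustrative rate table: importance floor -> NxN pixels per shading sample
# (1 = full per-pixel shading, larger N = coarser shading, as in
# variable rate shading hardware). Values are assumptions.
RATE_TABLE = [
    (0.75, 1),
    (0.50, 2),
    (0.25, 4),
    (0.00, 8),
]

def tile_importance(edge, contrast, color, region_weight):
    """Fuse per-tile features (each in [0, 1]) into one importance score.

    region_weight would come from the gaze-based screen-space region map;
    the 0.4/0.3/0.3 weights are illustrative assumptions.
    """
    score = 0.4 * edge + 0.3 * contrast + 0.3 * color
    return min(1.0, score * region_weight)

def shading_rate(importance):
    """Look up the shading coarseness for a tile's importance score."""
    for floor, rate in RATE_TABLE:
        if importance >= floor:
            return rate
    return RATE_TABLE[-1][1]
```

Applied to every 16 × 16 tile, this yields the per-frame shading rate control map that the variable rate shading stage consumes.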
Field operations with large machines typically require large onsite task spaces. In such cases, remote skilled workers must provide step-by-step guidance by observing the onsite situation around the large task space in three dimensions, instructing onsite unskilled workers on how to perform tasks with the correct hand gestures at the right positions, and switching frequently between observation and instruction during remote support. In this study, we developed a Metaverse-based remote support system with a seamless user interface for switching between free viewpoint observation and hand gesture instruction. The proposed system enables remote skilled workers to observe the onsite field from any viewpoint, transfer hand gesture instructions to onsite workers, and seamlessly switch between free viewpoint observation and free hand gesture instruction without changing devices. We compared the time efficiency of the proposed system and a conventional system through experiments with 28 users and found that our system improved observation time efficiency by 65.7%, instruction time efficiency by 27.9%, and switching time efficiency between observation and instruction by 14.6%. These results indicate that the proposed system enables remote skilled workers to support onsite workers quickly and efficiently.
Takashi Numata, Yuya Ogi, Keiichi Mitani, Kazuyuki Tajima, Yusuke Nakamura, Naohito Ikeda, and Kenichi Shimada, "Metaverse-based remote support system with smooth combination of free viewpoint observation and hand gesture instruction," Journal of the Society for Information Display, vol. 32, no. 9, pp. 653–664, published 2024-07-14. DOI: 10.1002/jsid.1339. Open access: https://onlinelibrary.wiley.com/doi/epdf/10.1002/jsid.1339
We fabricated a microdisplay with a 1.50-in. organic light-emitting diode (OLED) and a pixel density as high as 3207 ppi. An ideal display with high luminance, low power consumption, an ultrahigh aperture ratio, and a wide color gamut can be fabricated using a metal maskless lithography technology for patterning OLED layers and an oxide semiconductor large-scale integration (OSLSI)/silicon LSI backplane. We designed a virtual reality device that exhibited less ghosting and a field of view of 90° or greater by combining the microdisplay and a novel pancake lens.
Hisao Ikeda, Ryo Hatsumi, Yuki Tamatsukuri, Shoki Miyata, Daiki Nakamura, Munehiro Kozuma, Hidetomo Kobayashi, Yasumasa Yamane, Sachiko Yamagata, Yousuke Tsukamoto, and Shunpei Yamazaki, "VR device with high resolution, high luminance, and low power consumption using 1.50-in. organic light-emitting diode display," Journal of the Society for Information Display, vol. 32, no. 8, pp. 580–591, published 2024-07-11. DOI: 10.1002/jsid.1345. Open access: https://onlinelibrary.wiley.com/doi/epdf/10.1002/jsid.1345
As ultra-high-definition displays have gained popularity, mitigating the horizontal crosstalk effect in 8K LCD panels has become crucial. High display resolution requires narrower signal line integration, intensifying the coupling effect. Traditional methods such as Vcom feedback and increasing the analog voltage drain-drain (AVDD) load are slower and less accurate, leading to increased power consumption. In response, we propose an advanced digital signal compensation method. In this study, we developed a predictive model and investigated the intricate relationships among AVDD, Vcom, and storage capacitance (Cst) ripples and horizontal crosstalk. Optimizing the ripple change (∆G) by varying the compensation coefficient (a) and decay ratio (τ) significantly reduces crosstalk effects. The digital compensation method allows rapid and precise compensation without delays, reducing horizontal crosstalk in 8K LCD panels from 4% to below 0.9%. This surpasses the requirement of minimizing crosstalk to less than 2%, substantially enhancing the image quality of high-resolution displays.
Yongwoo Lee, Kiwon Choi, Hyeryoung Park, Yong Ju Kim, Kookhyun Choi, and Min Jae Ko, "Digital horizontal crosstalk compensation in 8K LCD displays for enhanced image quality," Journal of the Society for Information Display, vol. 32, no. 9, pp. 665–675, published 2024-07-07. DOI: 10.1002/jsid.1342.
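One plausible reading of the compensation parameters above, purely as a sketch: a correction of size a·∆G applied to the pixels following a data-load step, decaying with ratio τ along the line. The exponential form and the function name are assumptions standing in for the paper's actual compensation model, which the abstract does not specify.

```python
import math

def ripple_correction(delta_g, a, tau, n_pixels):
    """Hypothetical per-pixel correction after a load step of size delta_g.

    a   -- compensation coefficient (scales the initial correction)
    tau -- decay ratio (how quickly the correction dies out along the line)
    Returns a list of n_pixels correction values: a * delta_g * exp(-k / tau).
    """
    return [a * delta_g * math.exp(-k / tau) for k in range(n_pixels)]
```

The appeal of a purely digital correction of this form is that it is applied to the pixel data itself, so it incurs none of the settling delay of analog Vcom feedback.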
A wide-viewing-angle dual-view integral imaging display is proposed. A micro-lens array, a polarizer parallax barrier, and a display panel are aligned in sequence, with the display panel covered by the polarizer parallax barrier. Two types of orthogonal polarizer slits in the barrier are alternately aligned and polarize the light from the two types of elemental images. The micro-lens array propagates the two types of polarized light into two primary viewing zones, which coincide at the optimal viewing distance. Different 3D images are observed, respectively, through the two types of polarizer glasses. The viewing angle is enhanced and is unrelated to the number of elemental images.
Bai-Chuan Zhao, Wei Jia, Wei Fan, Fan Yang, and Yang Fu, "Wide-viewing-angle dual-view integral imaging display," Journal of the Society for Information Display, vol. 32, no. 10, pp. 697–702, published 2024-07-04. DOI: 10.1002/jsid.1344.