
Displays: Latest Publications

DefocusSR2: An efficient depth-guided and distillation-based framework for defocus images super-resolution
IF 3.7 | CAS Tier 2, Engineering & Technology | Q1 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE | Pub Date: 2024-11-15 | DOI: 10.1016/j.displa.2024.102883
Qipei Li, Da Pan, Zefeng Ying, Qirong Liang, Ping Shi
Existing image super-resolution (SR) methods often lead to oversharpening, particularly in defocused images. However, we have observed that defocused regions and focused regions present different levels of recovery difficulty. This observation opens up opportunities for more efficient enhancements. In this paper, we introduce DefocusSR2, an efficient framework designed for super-resolution of defocused images. DefocusSR2 consists of two main modules: Depth-Guided Segmentation (DGS) and Defocus-Aware Classify Enhance (DCE). In the DGS module, we utilize MobileSAM, guided by depth information, to accurately segment the input image and generate defocus maps. These maps provide detailed information about the locations of defocused areas. In the DCE module, we crop the defocus map and classify the segments into defocused and focused patches based on a predefined threshold. Through knowledge distillation and blur-kernel matching, the network retains the blur kernel to reduce computational load. In practice, the defocused patches are fed into the Efficient Blur Match SR Network (EBM-SR), where the blur kernel is preserved to alleviate computational demands. The focused patches, on the other hand, are processed using more computationally intensive operations. Thus, DefocusSR2 integrates defocus classification and super-resolution within a unified framework. Experiments demonstrate that DefocusSR2 can accelerate most SR methods, reducing the FLOPs of SR models by approximately 70% while maintaining state-of-the-art SR performance.
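The routing step described here (split the defocus map into patches, threshold each patch, and send defocused patches to a cheaper branch) can be sketched as below. This is a minimal illustration, assuming a defocus map normalized to [0, 1]; the patch size, threshold value, and the `light_sr`/`heavy_sr` callables are placeholders, not the paper's actual EBM-SR or its heavier counterpart.

```python
import numpy as np

def route_patches(image, defocus_map, light_sr, heavy_sr,
                  patch=64, threshold=0.5):
    """Split an image into patches and send each one to a cheap or an
    expensive SR callable depending on its mean defocus value.

    image       : HxWx3 float array (low-resolution input)
    defocus_map : HxW float array in [0, 1], higher = more defocused
    light_sr    : callable for defocused patches (placeholder for EBM-SR)
    heavy_sr    : callable for focused patches (full SR model)
    """
    h, w = defocus_map.shape
    out_patches = []
    for y in range(0, h, patch):
        for x in range(0, w, patch):
            img_p = image[y:y + patch, x:x + patch]
            blur_level = defocus_map[y:y + patch, x:x + patch].mean()
            # Defocused patches tolerate a cheaper network; focused
            # patches go through the computationally heavier branch.
            sr = light_sr if blur_level >= threshold else heavy_sr
            out_patches.append(((y, x), sr(img_p)))
    return out_patches
```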
Citations: 0
Mambav3d: A mamba-based virtual 3D module stringing semantic information between layers of medical image slices
IF 3.7 | CAS Tier 2, Engineering & Technology | Q1 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE | Pub Date: 2024-11-15 | DOI: 10.1016/j.displa.2024.102890
Xiaoxiao Liu, Yan Zhao, Shigang Wang, Jian Wei
High-precision medical image segmentation provides a reliable basis for clinical analysis and diagnosis. Researchers have developed various models to enhance the segmentation performance of medical images. Among these methods, two-dimensional models such as Unet exhibit a simple structure, low computational resource requirements, and strong local feature capture capabilities. However, their spatial information utilization is insufficient, limiting their segmentation accuracy. Three-dimensional models, such as 3D Unet, utilize spatial information more fully and are suitable for complex tasks, but they require high computational resources and have limited real-time performance. In this paper, we propose a virtual 3D module (Mambav3d) based on mamba, which introduces spatial information into 2D segmentation tasks to more fully integrate the 3D information of the image and further improve segmentation accuracy under conditions of low computational resource requirements. Mambav3d leverages the properties of hidden states in the state space model, combined with the shift of visual perspective, to incorporate semantic information between different anatomical planes in different slices of the same 3D sample. The voxel segmentation is converted to pixel segmentation to reduce model training data requirements and model complexity while ensuring that the model integrates 3D information and enhances segmentation accuracy. The model references the information from previous layers when labeling the current layer, thereby facilitating the transfer of semantic information between slice layers and avoiding the high computational cost associated with using structures such as Transformers between layers. We have implemented Mambav3d on Unet and evaluated its performance on the BraTs, Amos, and KiTs datasets, demonstrating superiority over other state-of-the-art methods.
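A rough sketch of the slice-stringing idea is given below, assuming a generic recurrent update that carries a hidden feature map from one slice to the next; the convolutional `encode`, `update`, and `head` modules are illustrative stand-ins, not the paper's Mamba-based state-space blocks.

```python
import torch
import torch.nn as nn

class SliceStringer(nn.Module):
    """Toy 2D segmenter that passes a hidden feature map between
    consecutive slices of a 3D volume (illustrative stand-in for the
    inter-slice state propagation described in the abstract)."""

    def __init__(self, channels=16, num_classes=2):
        super().__init__()
        self.encode = nn.Conv2d(1, channels, 3, padding=1)              # per-slice features
        self.update = nn.Conv2d(2 * channels, channels, 3, padding=1)   # fuse state + features
        self.head = nn.Conv2d(channels, num_classes, 1)                 # per-pixel logits

    def forward(self, volume):                    # volume: (D, 1, H, W)
        d, _, h, w = volume.shape
        state = volume.new_zeros((1, self.encode.out_channels, h, w))
        logits = []
        for i in range(d):
            feat = torch.relu(self.encode(volume[i:i + 1]))
            # The hidden state carries semantic context from previous slices.
            state = torch.relu(self.update(torch.cat([state, feat], dim=1)))
            logits.append(self.head(state))
        return torch.cat(logits, dim=0)            # (D, num_classes, H, W)
```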
Citations: 0
Luminance decomposition and Transformer based no-reference tone-mapped image quality assessment
IF 3.7 | CAS Tier 2, Engineering & Technology | Q1 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE | Pub Date: 2024-11-14 | DOI: 10.1016/j.displa.2024.102881
Zikang Chen , Zhouyan He , Ting Luo , Chongchong Jin , Yang Song
Tone-Mapping Operators (TMOs) play a crucial role in converting High Dynamic Range (HDR) images into Tone-Mapped Images (TMIs) with standard dynamic range for optimal display on standard monitors. Nevertheless, TMIs generated by distinct TMOs may exhibit diverse visual artifacts, highlighting the significance of TMI Quality Assessment (TMIQA) methods in predicting perceptual quality and guiding advancements in TMOs. Inspired by luminance decomposition and Transformer, a new no-reference TMIQA method based on deep learning is proposed in this paper, named LDT-TMIQA. Specifically, a TMI will change under the influence of different TMOs, potentially resulting in either over-exposure or under-exposure, leading to structure distortion and changes in texture details. Therefore, we first decompose the luminance channel of a TMI into a base layer and a detail layer that capture structure information and texture information, respectively. Then, they are employed with the TMI collectively as inputs to the Feature Extraction Module (FEM) to enhance the availability of prior information on luminance, structure, and texture. Additionally, the FEM incorporates the Cross Attention Prior Module (CAPM) to model the interdependencies among the base layer, detail layer, and TMI while employing the Iterative Attention Prior Module (IAPM) to extract multi-scale and multi-level visual features. Finally, a Feature Selection Fusion Module (FSFM) is proposed to obtain final effective features for predicting the quality scores of TMIs by reducing the weight of unnecessary features and fusing the features of different levels with equal importance. Extensive experiments on the publicly available TMI benchmark database indicate that the proposed LDT-TMIQA reaches the state-of-the-art level.
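The base/detail split of the luminance channel can be illustrated with a simple low-pass decomposition, sketched below; the Gaussian filter and its sigma are assumptions, since the abstract does not state which filter the authors actually use.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def decompose_luminance(rgb, sigma=5.0):
    """Split a tone-mapped image into a structure-carrying base layer and
    a texture-carrying detail layer (illustrative decomposition; sigma and
    the choice of low-pass filter are assumptions).

    rgb : HxWx3 float array in [0, 1]
    """
    # ITU-R BT.709 luma coefficients for the luminance channel.
    lum = 0.2126 * rgb[..., 0] + 0.7152 * rgb[..., 1] + 0.0722 * rgb[..., 2]
    base = gaussian_filter(lum, sigma=sigma)   # smooth structure component
    detail = lum - base                        # residual texture component
    return lum, base, detail
```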
Citations: 0
Precise subpixel luminance extraction method for De-Mura of AMOLED displays
IF 3.7 | CAS Tier 2, Engineering & Technology | Q1 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE | Pub Date: 2024-11-14 | DOI: 10.1016/j.displa.2024.102889
Zhong Zheng, Zhaohua Zhou, Ruipeng Chen, Jiajie Liu, Chun Liu, Lirong Zhang, Lei Zhou, Miao Xu, Lei Wang, Weijing Wu, Junbiao Peng
Currently, Mura defects have a significant impact on the yield of AMOLED panels, and De-Mura plays a critical role in the compensation. To enhance the applicability of the subpixel luminance extraction method in De-Mura and to address inaccuracies caused by aperture diffraction limit and geometric defocusing in camera imaging, this paper proposes a precise extraction method based on effective area. We establish the concept of the effective area first and then determine the effective area of subpixel imaging on the camera sensor by incorporating the circle of confusion (CoC) caused by aperture diffraction limits and geometric defocusing. Finally, more precise luminance information is obtained. Results show that, after compensation, the Mura on the white screen is almost eliminated subjectively. Objectively, by constructing normalized luminance curves for subpixels in Mura regions, the standard deviation indicates that our method outperforms the traditional whole-pixel method, improving uniformity by approximately 50%.
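For reference, the geometric-defocus part of the circle of confusion follows the standard thin-lens relation, sketched below; the diffraction contribution and the exact way the CoC is folded into the subpixel's effective area are not given in the abstract, so the `effective_footprint` helper is only an illustrative assumption.

```python
def coc_diameter(f, n, focus_dist, obj_dist):
    """Thin-lens circle-of-confusion diameter on the sensor (same units as f)
    for an object at obj_dist when the lens is focused at focus_dist.
    f is the focal length, n the f-number (aperture diameter = f / n).
    Valid for focus_dist > f; diffraction is not included here."""
    aperture = f / n
    return aperture * abs(obj_dist - focus_dist) / obj_dist * f / (focus_dist - f)

def effective_footprint(width, height, coc):
    """Rough effective imaging footprint of one subpixel on the sensor:
    its geometric image grown by the blur diameter (an illustrative
    assumption, not the paper's exact effective-area definition)."""
    return (width + coc) * (height + coc)
```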
Citations: 0
Font and background color combinations influence recognition efficiency: A novel method via primary color Euclidean distance and response surface analysis
IF 3.7 | CAS Tier 2, Engineering & Technology | Q1 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE | Pub Date: 2024-11-12 | DOI: 10.1016/j.displa.2024.102873
Wenchao Zhu , Zeliang Cheng , Qi Wang , Jing Du , Yingzi Lin
The readability of human–computer interfaces affects users’ visual performance when using electronic devices, yet it receives inadequate attention. This issue matters most under high-stress conditions such as firefighting, where accurate and fast information processing is critical. This study addresses how font and background color combinations on Liquid Crystal Displays (LCDs) affect recognition efficiency. A novel concept, primary color Euclidean distance (PCED), is introduced and tested in a repeated-measures experiment. Three factors were investigated: background color (black, white), font color (red, green, blue), and PCED. A total of 24 participants were recruited. Results demonstrate that color combinations with specific PCED values can substantially impact recognition efficiency. Using response surface analysis (RSA), the study modelled response time with a generalized mathematical model. Results showed that blue fonts on a black background produced the longest response times. The study also explored the influence of physical stress on recognition efficiency, revealing a latency of about 100 ms across all color combinations. The findings offer a methodological advance in understanding the effects of color combinations in digital displays, setting the stage for future research in diverse demographic and technological contexts, including mixed reality.
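The abstract does not spell out the PCED formula; a straightforward reading is the Euclidean distance between the font and background colors in RGB primary-color space, as in this minimal sketch.

```python
import math

def primary_color_euclidean_distance(font_rgb, background_rgb):
    """Euclidean distance between two colors in RGB space, e.g.
    PCED((255, 0, 0), (0, 0, 0)) == 255.0 for red text on black."""
    return math.sqrt(sum((f - b) ** 2 for f, b in zip(font_rgb, background_rgb)))

# Example: blue font on a black background (the slowest combination reported).
print(primary_color_euclidean_distance((0, 0, 255), (0, 0, 0)))  # 255.0
```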
Citations: 0
GLDBF: Global and local dual-branch fusion network for no-reference point cloud quality assessment
IF 3.7 | CAS Tier 2, Engineering & Technology | Q1 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE | Pub Date: 2024-11-09 | DOI: 10.1016/j.displa.2024.102882
Zhichao Chen , Shuyu Xiao , Yongfang Wang , Yihan Wang , Hongming Cai
No-reference Point Cloud Quality Assessment (NR-PCQA) remains a challenging task in media quality assessment: existing no-reference PCQA metrics struggle to accurately capture quality-related features because of the unique scattered structure of points, and they rarely consider global and local features jointly. To address these challenges, we propose a Global and Local Dual-Branch Fusion (GLDBF) network for no-reference point cloud quality assessment. Firstly, sparse convolution is used to extract the global quality feature of distorted Point Clouds (PCs). Secondly, graph weighted PointNet++ is proposed to extract multi-level local features of the point cloud, and an offset attention mechanism is further used to enhance effective local features. A Transformer-based fusion module is also proposed to fuse the multi-level local features. Finally, we join the global and local dual-branch fusion modules via a multilayer perceptron to predict the quality score of distorted PCs. Experimental results show that the proposed algorithm achieves state-of-the-art performance compared with existing methods in assessing the quality of distorted PCs.
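The final step, joining the global feature with pooled multi-level local features through a multilayer perceptron to regress a quality score, can be sketched as follows; the layer sizes and the mean pooling are assumptions rather than the paper's configuration.

```python
import torch
import torch.nn as nn

class DualBranchScoreHead(nn.Module):
    """Concatenate a global point-cloud feature with pooled local features
    and regress a single quality score (illustrative sizes)."""

    def __init__(self, global_dim=256, local_dim=256, hidden=128):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(global_dim + local_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, global_feat, local_feats):
        # global_feat: (B, global_dim); local_feats: (B, N, local_dim)
        pooled_local = local_feats.mean(dim=1)      # average over local regions
        fused = torch.cat([global_feat, pooled_local], dim=1)
        return self.mlp(fused).squeeze(-1)          # (B,) predicted quality scores
```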
Citations: 0
Weighted ensemble deep learning approach for classification of gastrointestinal diseases in colonoscopy images aided by explainable AI
IF 3.7 | CAS Tier 2, Engineering & Technology | Q1 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE | Pub Date: 2024-11-06 | DOI: 10.1016/j.displa.2024.102874
Faruk Enes Oğuz , Ahmet Alkan
Gastrointestinal diseases are significant health issues worldwide, requiring early diagnosis due to their serious health implications. Therefore, detecting these diseases using artificial intelligence-based medical decision support systems through colonoscopy images plays a critical role in early diagnosis. In this study, a deep learning-based method is proposed for the classification of gastrointestinal diseases and colon anatomical landmarks using colonoscopy images. For this purpose, five different Convolutional Neural Network (CNN) models, namely Xception, ResNet-101, NASNet-Large, EfficientNet, and NASNet-Mobile, were trained. An ensemble model was created using class-based recall values derived from the validation performances of the top three models (Xception, ResNet-101, NASNet-Large). A user-friendly Graphical User Interface (GUI) was developed, allowing users to perform classification tasks and use Gradient-weighted Class Activation Mapping (Grad-CAM), an explainable AI tool, to visualize the regions from which the model derives information. Grad-CAM visualizations contribute to a better understanding of the model’s decision-making processes and play an important role in the application of explainable AI. In the study, eight labels, including anatomical markers such as z-line, pylorus, and cecum, as well as pathological findings like esophagitis, polyps, and ulcerative colitis, were classified using the KVASIR V2 dataset. The proposed ensemble model achieved 94.125% accuracy on the KVASIR V2 dataset, demonstrating competitive performance compared to similar studies in the literature. Additionally, the precision and F1 score of this model are 94.168% and 94.125%, respectively. These results suggest that the proposed method provides an effective solution for the diagnosis of GI diseases and can be beneficial for medical education.
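The class-based recall weighting can be sketched as below: each model's softmax output is weighted, class by class, by the recall it achieved on the validation set, and the weighted probabilities are summed. How the weights are normalized is an assumption, since the abstract does not specify it.

```python
import numpy as np

def class_recall_weights(y_true, y_pred, num_classes):
    """Per-class recall of one model on the validation set."""
    recalls = np.zeros(num_classes)
    for c in range(num_classes):
        mask = (y_true == c)
        recalls[c] = (y_pred[mask] == c).mean() if mask.any() else 0.0
    return recalls

def recall_weighted_ensemble(prob_list, weight_list):
    """Combine per-model class probabilities with class-based recall weights.

    prob_list   : list of (N, C) softmax outputs, one per model
    weight_list : list of (C,) validation recall vectors, one per model
    """
    weights = np.stack(weight_list)                                # (M, C)
    weights = weights / (weights.sum(axis=0, keepdims=True) + 1e-12)  # per-class normalization (assumption)
    combined = sum(w[None, :] * p for w, p in zip(weights, prob_list))
    return combined.argmax(axis=1)                                 # final predicted labels
```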
Citations: 0
Virtual reality in medical education: Effectiveness of Immersive Virtual Anatomy Laboratory (IVAL) compared to traditional learning approaches
IF 3.7 | CAS Tier 2, Engineering & Technology | Q1 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE | Pub Date: 2024-11-06 | DOI: 10.1016/j.displa.2024.102870
Mohammed Kadri , Fatima-Ezzahra Boubakri , Timothy Teo , Fatima-Zahra Kaghat , Ahmed Azough , Khalid Alaoui Zidani
Immersive Virtual Anatomy Laboratory (IVAL) is an innovative learning tool that combines virtual reality and serious games elements to enhance anatomy education. This experimental study compares IVAL with traditional learning methods in terms of educational effectiveness and user acceptance. An experimental design was implemented with 120 undergraduate health-science students, randomly assigned to two groups: an experimental group using IVAL, and a control group following traditional learning methods. Data collection focused on quantitative measures such as pretest and posttest vocabulary assessment scores and task completion times, alongside qualitative measures obtained through a user experience questionnaire. This study utilizes the Technology Acceptance Model (TAM), incorporating variables such as Perceived Usefulness and Perceived Ease of Use. Results revealed significant improvements in the experimental group, with a 55.95% increase in vocabulary scores and an 18.75% reduction in task completion times compared to the control group. Qualitative data indicated that IVAL users reported greater Perceived Usefulness of the technology, improved Perceived Ease of Use, a more positive Attitude Towards Using IVAL, and stronger Behavioral Intention to continue using IVAL for anatomy learning. This study demonstrates that the integration of immersive virtual reality in the IVAL approach offers a promising method to enhance anatomy education. The findings provide insights into the effectiveness of immersive learning environments in improving learning outcomes and user acceptance. While further research is needed to explore long-term effects, this innovative approach not only enhances the effectiveness and enjoyment of anatomy learning but also provides valuable data on optimizing educational technology for improved learning outcomes.
Citations: 0
CIFTC-Net: Cross information fusion network with transformer and CNN for polyp segmentation
IF 3.7 | CAS Tier 2, Engineering & Technology | Q1 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE | Pub Date: 2024-11-02 | DOI: 10.1016/j.displa.2024.102872
Xinyu Li , Qiaohong Liu , Xuewei Li , Tiansheng Huang , Min Lin , Xiaoxiang Han , Weikun Zhang , Keyan Chen , Yuanjie Lin
Polyp segmentation plays a crucial role in the early diagnosis and treatment of colorectal cancer, which is the third most common cancer worldwide. Despite remarkable successes achieved by recent deep learning-related works, accurate segmentation of polyps remains challenging due to the diversity in their shapes, sizes, appearances, and other factors. To address these problems, a novel cross information fusion network with Transformer and convolutional neural network (CNN) for polyp segmentation, named CIFTC-Net, is proposed to improve the segmentation performance of colon polyps. In particular, a dual-branch encoder with Pyramid Vision Transformer (PVT) and ResNet50 is employed to take full advantage of both the global semantic information and local spatial features to enhance the feature representation ability. To effectively fuse the two types of features, a new global–local feature fusion (GLFF) module is designed. Additionally, in the PVT branch, a multi-scale feature integration (MSFI) module is introduced to fuse multi-scale features adaptively. At the bottom of the model, a multi-scale atrous pyramid bridging (MSAPB) module is proposed to achieve rich and robust multi-level features and improve the segmentation accuracy. Experimental results on four public polyp segmentation datasets demonstrate that CIFTC-Net surpasses current state-of-the-art methods across various metrics, showcasing its superiority in segmentation accuracy, generalization ability, and handling of complex images.
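The multi-scale atrous pyramid bridging idea can be illustrated with parallel dilated convolutions whose outputs are concatenated and fused back to the input width; the dilation rates and channel sizes below are assumptions rather than the exact MSAPB design.

```python
import torch
import torch.nn as nn

class AtrousPyramid(nn.Module):
    """Parallel dilated 3x3 convolutions over a bottleneck feature map,
    concatenated and fused back to the input width (illustrative stand-in
    for the MSAPB module; rates and channels are assumptions)."""

    def __init__(self, channels=256, rates=(1, 2, 4, 8)):
        super().__init__()
        self.branches = nn.ModuleList(
            [nn.Conv2d(channels, channels, 3, padding=r, dilation=r) for r in rates]
        )
        self.fuse = nn.Conv2d(channels * len(rates), channels, 1)

    def forward(self, x):
        # padding == dilation keeps the spatial size unchanged for 3x3 kernels
        feats = [torch.relu(branch(x)) for branch in self.branches]
        return torch.relu(self.fuse(torch.cat(feats, dim=1)))
```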
Citations: 0
From hardware to software integration: A comparative study of usability and safety in vehicle interaction modes
IF 3.7 | CAS Tier 2, Engineering & Technology | Q1 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE | Pub Date: 2024-10-30 | DOI: 10.1016/j.displa.2024.102869
Haibo Yin , Rui Li , Yingjie Victor Chen
The continuing advance of human–machine interaction (HMI) technology has brought vehicle HMI modes into focus, as they are closely related to driver and passenger safety and directly affect the travel experience. This study compared the usability and safety of three vehicle HMI modes: hardware interaction (HI), hardware and software interaction (HSI), and software interaction (SI). The evaluation comprised two dimensions: usability and safety. Sixty participants’ performance was evaluated at two driving speeds (30 km/h and 60 km/h). Nonparametric tests indicated significant differences between the three interaction modes: (1) HI was the most safety-oriented interaction mode; participants had the highest average vehicle speed and maximum acceleration at 60 km/h and the lowest glance frequency at both speeds. (2) HSI was the most usable interaction mode; participants had the shortest task-completion time at 60 km/h and the highest NASA-TLX and SUS scores at both speeds. (3) SI was the least secure and usable in-vehicle interaction mode; participants had the longest task-completion time at 60 km/h, the highest error frequency at 30 and 60 km/h, the highest glance frequency, the longest total glance duration, and the longest average glance time. In conclusion, HI and HSI were more secure and usable in-vehicle interaction modes than SI. From a theoretical exploration perspective, this paper offers exploratory ideas for the practical selection and design of screen HMI modes in intelligent vehicle cabins.
Citations: 0