
Displays: Latest Publications

Mambav3d: A mamba-based virtual 3D module stringing semantic information between layers of medical image slices
IF 3.7 | CAS Zone 2 (Engineering & Technology) | Q1 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE | Pub Date: 2024-11-15 | DOI: 10.1016/j.displa.2024.102890
Xiaoxiao Liu, Yan Zhao, Shigang Wang, Jian Wei
High-precision medical image segmentation provides a reliable basis for clinical analysis and diagnosis. Researchers have developed various models to enhance the segmentation performance of medical images. Among these methods, two-dimensional models such as Unet exhibit a simple structure, low computational resource requirements, and strong local feature capture capabilities. However, their spatial information utilization is insufficient, limiting their segmentation accuracy. Three-dimensional models, such as 3D Unet, utilize spatial information more fully and are suitable for complex tasks, but they require high computational resources and have limited real-time performance. In this paper, we propose a virtual 3D module (Mambav3d) based on Mamba, which introduces spatial information into 2D segmentation tasks to more fully integrate the 3D information of the image and further improve segmentation accuracy while keeping computational resource requirements low. Mambav3d leverages the properties of hidden states in the state space model, combined with the shift of visual perspective, to incorporate semantic information between different anatomical planes in different slices of the same 3D sample. Voxel segmentation is converted to pixel segmentation to reduce training data requirements and model complexity while ensuring that the model integrates 3D information and enhances segmentation accuracy. The model references information from previous layers when labeling the current layer, thereby facilitating the transfer of semantic information between slice layers and avoiding the high computational cost associated with using structures such as Transformers between layers. We have implemented Mambav3d on Unet and evaluated its performance on the BraTS, AMOS, and KiTS datasets, demonstrating superiority over other state-of-the-art methods.
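The mechanism sketched in the abstract, carrying a compact hidden state across slices so that a 2D segmenter can reuse semantic context from previously processed anatomical planes, can be illustrated with a minimal recurrent stand-in. The PyTorch module below is not the authors' Mambav3d implementation; the gated state update (a GRU cell standing in for a state space model), the feature shapes, and the 1x1-convolution injection are assumptions made purely for illustration.

```python
import torch
import torch.nn as nn

class InterSliceState(nn.Module):
    """Toy stand-in for a virtual-3D module: a hidden state is scanned
    across slices so each 2D slice sees context from earlier slices."""
    def __init__(self, channels: int, state_dim: int = 64):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)             # slice -> global descriptor
        self.update = nn.GRUCell(channels, state_dim)   # recurrent state update
        self.inject = nn.Conv2d(channels + state_dim, channels, kernel_size=1)

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        # feats: (B, D, C, H, W), i.e. D slices of 2D feature maps per volume
        B, D, C, H, W = feats.shape
        h = feats.new_zeros(B, self.update.hidden_size)
        out = []
        for d in range(D):                              # scan along the slice axis
            slice_feat = feats[:, d]                    # (B, C, H, W)
            desc = self.pool(slice_feat).flatten(1)     # (B, C)
            h = self.update(desc, h)                    # carry semantics forward
            state_map = h[:, :, None, None].expand(-1, -1, H, W)
            out.append(self.inject(torch.cat([slice_feat, state_map], dim=1)))
        return torch.stack(out, dim=1)                  # same shape as the input
```

In this toy version the scan over the slice axis plays the role the hidden state plays in a state space model: each slice's prediction can condition on a summary of the slices already seen, at far lower cost than full 3D convolution or inter-slice attention.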
Citations: 0
Luminance decomposition and Transformer based no-reference tone-mapped image quality assessment
IF 3.7 | CAS Zone 2 (Engineering & Technology) | Q1 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE | Pub Date: 2024-11-14 | DOI: 10.1016/j.displa.2024.102881
Zikang Chen, Zhouyan He, Ting Luo, Chongchong Jin, Yang Song
Tone-Mapping Operators (TMOs) play a crucial role in converting High Dynamic Range (HDR) images into Tone-Mapped Images (TMIs) with standard dynamic range for optimal display on standard monitors. Nevertheless, TMIs generated by distinct TMOs may exhibit diverse visual artifacts, highlighting the significance of TMI Quality Assessment (TMIQA) methods in predicting perceptual quality and guiding advancements in TMOs. Inspired by luminance decomposition and the Transformer, a new no-reference TMIQA method based on deep learning, named LDT-TMIQA, is proposed in this paper. Specifically, a TMI will change under the influence of different TMOs, potentially resulting in either over-exposure or under-exposure, leading to structure distortion and changes in texture details. Therefore, we first decompose the luminance channel of a TMI into a base layer and a detail layer that capture structure information and texture information, respectively. Then, together with the TMI, they are used as inputs to the Feature Extraction Module (FEM) to enhance the availability of prior information on luminance, structure, and texture. Additionally, the FEM incorporates the Cross Attention Prior Module (CAPM) to model the interdependencies among the base layer, detail layer, and TMI, while employing the Iterative Attention Prior Module (IAPM) to extract multi-scale and multi-level visual features. Finally, a Feature Selection Fusion Module (FSFM) is proposed to obtain the final effective features for predicting the quality scores of TMIs by reducing the weight of unnecessary features and fusing the features of different levels with equal importance. Extensive experiments on the publicly available TMI benchmark database indicate that the proposed LDT-TMIQA reaches the state-of-the-art level.
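The decomposition step described above, splitting the luminance channel into a smooth base layer that carries structure and a residual detail layer that carries texture, is typically implemented with a low-pass filter. A minimal sketch follows; the BT.709 luma weights and the Gaussian smoother are assumptions, since the paper may use a different colour transform or an edge-preserving filter.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def decompose_luminance(rgb: np.ndarray, sigma: float = 5.0):
    """Split an RGB tone-mapped image into a luminance base and detail layer.

    rgb: float array in [0, 1] with shape (H, W, 3).
    Returns (luminance, base, detail), where detail = luminance - base.
    """
    # Luminance channel (assumed BT.709 weights).
    y = 0.2126 * rgb[..., 0] + 0.7152 * rgb[..., 1] + 0.0722 * rgb[..., 2]
    base = gaussian_filter(y, sigma=sigma)   # low-pass component: structure
    detail = y - base                        # residual component: texture
    return y, base, detail
```

In the pipeline the abstract describes, the base layer, the detail layer, and the TMI itself would then be fed jointly to the feature-extraction backbone.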
Citations: 0
GLDBF: Global and local dual-branch fusion network for no-reference point cloud quality assessment
IF 3.7 | CAS Zone 2 (Engineering & Technology) | Q1 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE | Pub Date: 2024-11-09 | DOI: 10.1016/j.displa.2024.102882
Zhichao Chen, Shuyu Xiao, Yongfang Wang, Yihan Wang, Hongming Cai
No-reference Point Cloud Quality Assessment (NR-PCQA) remains a challenge in the field of media quality assessment: existing no-reference PCQA metrics struggle to accurately capture quality-related features because of the unique scattered structure of points, and they rarely consider global and local features jointly. To address these challenges, we propose a Global and Local Dual-Branch Fusion (GLDBF) network for no-reference point cloud quality assessment. Firstly, sparse convolution is used to extract the global quality feature of distorted Point Clouds (PCs). Secondly, a graph-weighted PointNet++ is proposed to extract the multi-level local features of the point cloud, and an offset attention mechanism is further used to enhance local effective features. A Transformer-based fusion module is also proposed to fuse multi-level local features. Finally, we join the global and local dual-branch fusion modules via a multilayer perceptron to predict the quality score of distorted PCs. Experimental results show that the proposed algorithm achieves state-of-the-art performance compared with existing methods in assessing the quality of distorted PCs.
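The last stage the abstract describes, joining the global branch and the local branch through a multilayer perceptron to regress a single quality score, can be sketched generically. The feature dimensions and MLP depth below are assumptions, and the sparse-convolution and PointNet++ extractors that would produce the two inputs are not reproduced here.

```python
import torch
import torch.nn as nn

class DualBranchRegressor(nn.Module):
    """Fuse one global and one local feature vector and regress a quality score."""
    def __init__(self, global_dim: int = 512, local_dim: int = 512):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(global_dim + local_dim, 256), nn.ReLU(inplace=True),
            nn.Linear(256, 64), nn.ReLU(inplace=True),
            nn.Linear(64, 1),                 # predicted quality score
        )

    def forward(self, g_feat: torch.Tensor, l_feat: torch.Tensor) -> torch.Tensor:
        fused = torch.cat([g_feat, l_feat], dim=-1)   # joint global + local evidence
        return self.mlp(fused).squeeze(-1)

# Usage with placeholder features for a batch of four distorted point clouds:
# scores = DualBranchRegressor()(torch.randn(4, 512), torch.randn(4, 512))
```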
Citations: 0
Weighted ensemble deep learning approach for classification of gastrointestinal diseases in colonoscopy images aided by explainable AI
IF 3.7 | CAS Zone 2 (Engineering & Technology) | Q1 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE | Pub Date: 2024-11-06 | DOI: 10.1016/j.displa.2024.102874
Faruk Enes Oğuz, Ahmet Alkan
Gastrointestinal diseases are significant health issues worldwide, requiring early diagnosis due to their serious health implications. Therefore, detecting these diseases using artificial intelligence-based medical decision support systems through colonoscopy images plays a critical role in early diagnosis. In this study, a deep learning-based method is proposed for the classification of gastrointestinal diseases and colon anatomical landmarks using colonoscopy images. For this purpose, five different Convolutional Neural Network (CNN) models, namely Xception, ResNet-101, NASNet-Large, EfficientNet, and NASNet-Mobile, were trained. An ensemble model was created using class-based recall values derived from the validation performances of the top three models (Xception, ResNet-101, NASNet-Large). A user-friendly Graphical User Interface (GUI) was developed, allowing users to perform classification tasks and use Gradient-weighted Class Activation Mapping (Grad-CAM), an explainable AI tool, to visualize the regions from which the model derives information. Grad-CAM visualizations contribute to a better understanding of the model's decision-making processes and play an important role in the application of explainable AI. In the study, eight labels, including anatomical markers such as z-line, pylorus, and cecum, as well as pathological findings like esophagitis, polyps, and ulcerative colitis, were classified using the KVASIR V2 dataset. The proposed ensemble model achieved a 94.125% accuracy on the KVASIR V2 dataset, demonstrating competitive performance compared to similar studies in the literature. Additionally, the precision and F1 score values of this model are equal to 94.168% and 94.125%, respectively. These results suggest that the proposed method provides an effective solution for the diagnosis of GI diseases and can be beneficial for medical education.
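The ensembling rule the study describes, weighting each model's class probabilities by that model's class-wise recall measured on the validation set, can be written compactly. The sketch below is a minimal NumPy illustration; the probability and recall arrays are placeholders for the outputs of Xception, ResNet-101, and NASNet-Large, and the per-class weight normalization is an assumption.

```python
import numpy as np

def recall_weighted_ensemble(probs: np.ndarray, recalls: np.ndarray) -> np.ndarray:
    """Combine per-model class probabilities using class-based recall weights.

    probs:   (n_models, n_samples, n_classes) softmax outputs of each model.
    recalls: (n_models, n_classes) validation recall of each model per class.
    Returns predicted class indices with shape (n_samples,).
    """
    # Normalize per class so that each class's model weights sum to one.
    weights = recalls / recalls.sum(axis=0, keepdims=True)   # (n_models, n_classes)
    weighted = probs * weights[:, None, :]                   # broadcast over samples
    ensemble_probs = weighted.sum(axis=0)                    # (n_samples, n_classes)
    return ensemble_probs.argmax(axis=1)

# e.g. three models and the eight KVASIR V2 classes:
# preds = recall_weighted_ensemble(np.stack([p1, p2, p3]), np.stack([r1, r2, r3]))
```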
Citations: 0
Virtual reality in medical education: Effectiveness of Immersive Virtual Anatomy Laboratory (IVAL) compared to traditional learning approaches
IF 3.7 | CAS Zone 2 (Engineering & Technology) | Q1 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE | Pub Date: 2024-11-06 | DOI: 10.1016/j.displa.2024.102870
Mohammed Kadri, Fatima-Ezzahra Boubakri, Timothy Teo, Fatima-Zahra Kaghat, Ahmed Azough, Khalid Alaoui Zidani
Immersive Virtual Anatomy Laboratory (IVAL) is an innovative learning tool that combines virtual reality and elements of serious games to enhance anatomy education. This experimental study compares IVAL with traditional learning methods in terms of educational effectiveness and user acceptance. An experimental design was implemented with 120 undergraduate health-science students, randomly assigned to two groups: an experimental group using IVAL, and a control group following traditional learning methods. Data collection focused on quantitative measures such as pretest and posttest vocabulary assessment scores and task completion times, alongside qualitative measures obtained through a user experience questionnaire. This study utilizes the Technology Acceptance Model (TAM), incorporating variables such as Perceived Usefulness and Perceived Ease of Use. Results revealed significant improvements in the experimental group, with a 55.95% increase in vocabulary scores and an 18.75% reduction in task completion times compared to the control group. Qualitative data indicated that IVAL users reported greater Perceived Usefulness of the technology, improved Perceived Ease of Use, a more positive Attitude Towards Using IVAL, and stronger Behavioral Intention to continue using IVAL for anatomy learning. This study demonstrates that the integration of immersive virtual reality in the IVAL approach offers a promising method to enhance anatomy education. The findings provide insights into the effectiveness of immersive learning environments in improving learning outcomes and user acceptance. While further research is needed to explore long-term effects, this innovative approach not only enhances the effectiveness and enjoyment of anatomy learning but also provides valuable data on optimizing educational technology for improved learning outcomes.
Citations: 0
CIFTC-Net: Cross information fusion network with transformer and CNN for polyp segmentation
IF 3.7 | CAS Zone 2 (Engineering & Technology) | Q1 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE | Pub Date: 2024-11-02 | DOI: 10.1016/j.displa.2024.102872
Xinyu Li, Qiaohong Liu, Xuewei Li, Tiansheng Huang, Min Lin, Xiaoxiang Han, Weikun Zhang, Keyan Chen, Yuanjie Lin
Polyp segmentation plays a crucial role in the early diagnosis and treatment of colorectal cancer, which is the third most common cancer worldwide. Despite remarkable successes achieved by recent deep learning-related works, accurate segmentation of polyps remains challenging due to the diversity in their shapes, sizes, appearances, and other factors. To address these problems, a novel cross information fusion network with Transformer and convolutional neural network (CNN) for polyp segmentation, named CIFTC-Net, is proposed to improve the segmentation performance of colon polyps. In particular, a dual-branch encoder with Pyramid Vision Transformer (PVT) and ResNet50 is employed to take full advantage of both the global semantic information and local spatial features to enhance the feature representation ability. To effectively fuse the two types of features, a new global–local feature fusion (GLFF) module is designed. Additionally, in the PVT branch, a multi-scale feature integration (MSFI) module is introduced to fuse multi-scale features adaptively. At the bottom of the model, a multi-scale atrous pyramid bridging (MSAPB) module is proposed to achieve rich and robust multi-level features and improve the segmentation accuracy. Experimental results on four public polyp segmentation datasets demonstrate that CIFTC-Net surpasses current state-of-the-art methods across various metrics, showcasing its superiority in segmentation accuracy, generalization ability, and handling of complex images.
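The fusion idea at the heart of the dual-branch encoder, combining a Transformer feature map that carries global semantics with a CNN feature map that carries local spatial detail, can be illustrated as below. This is a generic sketch rather than the paper's GLFF module; the resize-then-concatenate layout and the 1x1/3x3 convolutions are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GlobalLocalFusion(nn.Module):
    """Fuse a Transformer branch feature map with a CNN branch feature map."""
    def __init__(self, t_channels: int, c_channels: int, out_channels: int):
        super().__init__()
        self.proj = nn.Conv2d(t_channels + c_channels, out_channels, kernel_size=1)
        self.refine = nn.Conv2d(out_channels, out_channels, kernel_size=3, padding=1)

    def forward(self, t_feat: torch.Tensor, c_feat: torch.Tensor) -> torch.Tensor:
        # Bring the (usually coarser) Transformer map to the CNN resolution.
        t_feat = F.interpolate(t_feat, size=c_feat.shape[-2:],
                               mode="bilinear", align_corners=False)
        fused = self.proj(torch.cat([t_feat, c_feat], dim=1))  # channel-wise fusion
        return torch.relu(self.refine(fused))

# e.g. fusing a coarse PVT map with a finer ResNet50 map (hypothetical sizes):
# out = GlobalLocalFusion(512, 512, 256)(torch.randn(1, 512, 14, 14),
#                                        torch.randn(1, 512, 28, 28))
```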
Citations: 0
From hardware to software integration: A comparative study of usability and safety in vehicle interaction modes
IF 3.7 | CAS Zone 2 (Engineering & Technology) | Q1 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE | Pub Date: 2024-10-30 | DOI: 10.1016/j.displa.2024.102869
Haibo Yin, Rui Li, Yingjie Victor Chen
The increasing advancement of human–machine interaction (HMI) technology has brought the modes of vehicle HMI into focus, as they are closely related to driver and passenger safety and directly affect the travel experience. This study compared the usability and safety of three vehicle HMI modes: hardware interaction (HI), hardware and software interaction (HSI), and software interaction (SI). The evaluation comprised two dimensions: usability and safety. Sixty participants' performance on these tasks was evaluated at two driving speeds (30 km/h and 60 km/h). The results of the nonparametric tests indicated significant differences between the three interaction modes: (1) HI was the most safety-oriented interaction mode: participants had the highest average vehicle speed and maximum acceleration at 60 km/h and the lowest glance frequency at both speeds; (2) HSI was the most usable interaction mode: participants had the shortest task-completion time at 60 km/h and the highest scores on the NASA-TLX and SUS scales at both speeds; (3) SI was the least secure and usable in-vehicle interaction mode: participants had the longest task-completion time at 60 km/h, the highest error frequency at both 30 and 60 km/h, the highest glance frequency, the longest total glance duration, and the longest average glance time. In conclusion, HI and HSI were more secure and usable in-vehicle interaction modes than SI. From a theoretical exploration perspective, this paper elaborates exploratory thoughts and innovative ideas that can be applied in practice to screen HMI mode selection and design in intelligent vehicle cabins.
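The abstract reports nonparametric group comparisons without naming the tests, so the sketch below assumes a Kruskal-Wallis omnibus test across the three interaction modes followed by a pairwise Mann-Whitney U test, a common choice for this kind of between-mode comparison; the task-completion times used here are synthetic placeholders, not the study's data.

```python
import numpy as np
from scipy.stats import kruskal, mannwhitneyu

# Hypothetical task-completion times (s) at 60 km/h for the three modes.
rng = np.random.default_rng(0)
hi, hsi, si = (rng.normal(m, 2.0, 20) for m in (12.0, 10.5, 14.0))

h_stat, p_omnibus = kruskal(hi, hsi, si)    # omnibus difference across modes
u_stat, p_pair = mannwhitneyu(hsi, si)      # pairwise follow-up example
print(f"Kruskal-Wallis p={p_omnibus:.4f}, HSI vs SI Mann-Whitney p={p_pair:.4f}")
```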
Citations: 0
Cross-coupled prompt learning for few-shot image recognition
IF 3.7 | CAS Zone 2 (Engineering & Technology) | Q1 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE | Pub Date: 2024-10-30 | DOI: 10.1016/j.displa.2024.102862
Fangyuan Zhang, Rukai Wei, Yanzhao Xie, Yangtao Wang, Xin Tan, Lizhuang Ma, Maobin Tang, Lisheng Fan
Prompt learning based on large models shows great potential to reduce training time and resource costs, which has been progressively applied to visual tasks such as image recognition. Nevertheless, the existing prompt learning schemes suffer from either inadequate prompt information from a single modality or insufficient prompt interaction between multiple modalities, resulting in low efficiency and performance. To address these limitations, we propose a Cross-Coupled Prompt Learning (CCPL) architecture, which is designed with two novel components (i.e., Cross-Coupled Prompt Generator (CCPG) module and Cross-Modal Fusion (CMF) module) to achieve efficient interaction between visual and textual prompts. Specifically, the CCPG module incorporates a cross-attention mechanism to automatically generate visual and textual prompts, each of which will be adaptively updated using the self-attention mechanism in their respective image and text encoders. Furthermore, the CMF module implements a deep fusion to reinforce the cross-modal feature interaction from the output layer with the Image–Text Matching (ITM) loss function. We conduct extensive experiments on 8 image datasets. The experimental results verify that our proposed CCPL outperforms the SOTA methods on few-shot image recognition tasks. The source code of this project is released at: https://github.com/elegantTechie/CCPL.
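The cross-attention step that couples the two modalities, letting visual prompt tokens attend to text tokens and textual prompt tokens attend to image tokens, can be sketched as follows. This is an illustrative PyTorch stand-in rather than the CCPG module itself; the learnable query tokens, their count, and the shared embedding width are assumptions.

```python
import torch
import torch.nn as nn

class CrossCoupledPrompts(nn.Module):
    """Generate visual prompts attending to text tokens and vice versa."""
    def __init__(self, dim: int = 512, n_prompts: int = 4, n_heads: int = 8):
        super().__init__()
        self.visual_queries = nn.Parameter(torch.randn(n_prompts, dim) * 0.02)
        self.text_queries = nn.Parameter(torch.randn(n_prompts, dim) * 0.02)
        self.v_from_t = nn.MultiheadAttention(dim, n_heads, batch_first=True)
        self.t_from_v = nn.MultiheadAttention(dim, n_heads, batch_first=True)

    def forward(self, img_tokens: torch.Tensor, txt_tokens: torch.Tensor):
        # img_tokens: (B, N_img, dim); txt_tokens: (B, N_txt, dim)
        B = img_tokens.shape[0]
        vq = self.visual_queries.expand(B, -1, -1)   # (B, n_prompts, dim)
        tq = self.text_queries.expand(B, -1, -1)
        # Visual prompts are conditioned on text tokens, and vice versa.
        visual_prompts, _ = self.v_from_t(vq, txt_tokens, txt_tokens)
        text_prompts, _ = self.t_from_v(tq, img_tokens, img_tokens)
        return visual_prompts, text_prompts
```

The generated prompts would then be prepended to the respective encoder inputs, which is the point where the two modalities exchange information before the matching loss is applied.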
Citations: 0
Assessing arbitrary style transfer like an artist
IF 3.7 | CAS Zone 2 (Engineering & Technology) | Q1 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE | Pub Date: 2024-10-28 | DOI: 10.1016/j.displa.2024.102859
Hangwei Chen, Feng Shao, Baoyang Mu, Qiuping Jiang
Arbitrary style transfer (AST) is a distinctive technique for transferring artistic style into content images, with the goal of generating stylized images that approximate real artistic paintings. Thus, it is natural to develop a quantitative evaluation metric that acts like an artist to accurately assess the quality of AST images. Inspired by this, we present an artist-like network (AL-Net) which can analyze the quality of stylized images as an artist would, drawing on fine-grained knowledge of artistic painting (e.g., aesthetics, structure, color, texture). Specifically, the AL-Net consists of three sub-networks: an aesthetic prediction network (AP-Net), a content preservation prediction network (CPP-Net), and a style resemblance prediction network (SRP-Net), which can be regarded as specialized feature extractors, leveraging professional artistic painting knowledge through pre-training with different labels. To more effectively predict the final overall quality, we apply transfer learning to integrate the pre-trained feature vectors representing different painting elements into overall vision quality regression. The loss determined by the overall vision label fine-tunes the parameters of AL-Net, and thus our model can establish a tight connection with human perception. Extensive experiments on the AST-IQAD dataset validate that the proposed method achieves state-of-the-art performance.
Citations: 0
The role of image realism and expectation in illusory self-motion (vection) perception in younger and older adults
IF 3.7 | CAS Zone 2 (Engineering & Technology) | Q1 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE | Pub Date: 2024-10-28 | DOI: 10.1016/j.displa.2024.102868
Brandy Murovec, Julia Spaniol, Behrang Keshavarz
Research on the illusion of self-motion (vection) has primarily focused on younger adults, with few studies including older adults. In light of documented age differences in bottom-up and top-down perception and attention, the current study examined the impact of stimulus properties (speed), cognitive factors (expectancy), and a combination of both (stimulus realism) on vection in younger (18–35 years) and older (65+ years) adults. Participants were led to believe through manipulation of the study instructions that they were either likely or unlikely to experience vection before they were exposed to a rotating visual stimulus aimed to induce circular vection. Realism was manipulated by disrupting the global consistency of the visual stimulus comprised of an intact 360° panoramic photograph, resulting in two images (intact, scrambled). The speed of the stimulus was varied (faster, slower). Vection was measured using self-ratings of onset latency, duration, and intensity. Results showed that intact images produced more vection than scrambled images, especially at faster speeds. In contrast, expectation did not significantly impact vection. Overall, these patterns were similar across both age groups, although younger adults reported earlier vection onsets than older adults at faster speeds. These findings suggest that vection results from an interplay of stimulus-driven and cognitive factors in both younger and older adults.
Citations: 0