
Latest Articles in Displays

PGgraf: Pose-Guided generative radiance field for novel-views on X-ray
IF 3.4 | CAS Zone 2 (Engineering & Technology) | Q1 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE | Pub Date: 2026-01-17 | DOI: 10.1016/j.displa.2026.103354
Hangyu Li, Moquan Liu, Nan Wang, Mengcheng Sun, Yu Zhu
In clinical diagnosis, doctors usually rely on only a few X-rays to avoid exposing the patient to excessive ionizing radiation. Recent Neural Radiance Field (NeRF) techniques aim to generate novel views from a single X-ray to assist physicians in diagnosis. For this task, we exploit two advantages of X-ray imaging over natural images: (1) the medical equipment is fixed, so there is a standardized imaging pose; and (2) X-rays of the same body part taken at the same pose share an apparent structural prior. Based on these conditions, we propose a Pose-Guided generative radiance field (PGgraf) consisting of a generator and a discriminator. In the training phase, the discriminator combines image features with two kinds of pose information (the ray direction set and the camera angle) to guide the generator to synthesize X-rays consistent with the real view. In the generator, we design a Density Reconstruction Block (DRB). Unlike the original NeRF, which estimates particle density directly from particle positions, the DRB considers all particle features sampled along a ray and jointly predicts the density of each particle. Qualitative and quantitative experiments on two chest datasets and one knee dataset against state-of-the-art NeRF schemes show that PGgraf has a clear advantage in inferring novel views over different angular ranges. Across the three ranges of 0° to 360°, −15° to 15°, and 75° to 105°, the Peak Signal-to-Noise Ratio (PSNR) improves by an average of 4.18 dB, and the Learned Perceptual Image Patch Similarity (LPIPS) improves by an average of 50.7%.
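To make the DRB idea concrete, here is a minimal sketch (not the authors' implementation) of joint per-ray density prediction: point features from all samples along a ray are mixed by one self-attention layer before per-sample densities are emitted, and the densities are then composited with a Beer-Lambert-style line integral, as is natural for X-ray attenuation. The module names, layer sizes, and the use of attention are illustrative assumptions.

```python
# Sketch only: joint per-ray density prediction in the spirit of the DRB.
import torch
import torch.nn as nn

class DensityReconstructionSketch(nn.Module):
    def __init__(self, feat_dim: int = 64, hidden: int = 128):
        super().__init__()
        self.point_mlp = nn.Sequential(nn.Linear(3, feat_dim), nn.ReLU(),
                                       nn.Linear(feat_dim, feat_dim), nn.ReLU())
        # Joint processing of a whole ray: one self-attention layer over its samples.
        self.ray_attn = nn.MultiheadAttention(feat_dim, num_heads=4, batch_first=True)
        self.density_head = nn.Sequential(nn.Linear(feat_dim, hidden), nn.ReLU(),
                                          nn.Linear(hidden, 1), nn.Softplus())

    def forward(self, pts: torch.Tensor) -> torch.Tensor:
        # pts: (num_rays, samples_per_ray, 3) positions sampled along each ray.
        f = self.point_mlp(pts)                   # per-sample features
        f, _ = self.ray_attn(f, f, f)             # every sample sees the whole ray
        return self.density_head(f).squeeze(-1)   # per-sample densities

rays = torch.rand(8, 32, 3)                       # 8 rays, 32 samples each (toy data)
sigma = DensityReconstructionSketch()(rays)
delta = 1.0 / sigma.shape[1]                      # uniform sample spacing
attenuation = torch.exp(-(sigma * delta).sum(dim=1))   # Beer-Lambert-style integral per ray
```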
Citations: 0
A new iterative inverse display model
IF 3.4 | CAS Zone 2 (Engineering & Technology) | Q1 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE | Pub Date: 2026-01-16 | DOI: 10.1016/j.displa.2026.103342
María José Pérez-Peñalver, S.-W. Lee, Cristina Jordán, Esther Sanabria-Codesal, Samuel Morillas
In this paper, we propose a new inverse model for display characterization based on the direct model developed in Kim and Lee (2015). We use an iterative method to compute which inputs produce a desired color expressed in device-independent color coordinates. Whereas iterative approaches have been used for this task in the past, the main novelty of our proposal is the use of specific heuristics, based on the aforementioned display model and color-science principles, to achieve efficient and accurate convergence. On the one hand, to set the initial point of the iterative process, we use orthogonal projection of the desired color chromaticity, xy, onto the display's chromaticity triangle to find the initial ratio that the RGB coordinates need to have. Subsequently, we apply a multiplicative factor, preserving the RGB proportions, to initially approximate the desired color's luminance. This factor is obtained through a nonlinear model of the relation between RGB and luminance. On the other hand, to reduce the number of iterations needed, we use the direct model mentioned above: to set the RGB values for the next iteration, we look at the differences between the color predicted by the direct model for the current RGB values and the desired color coordinates, treating chromaticity and luminance separately, following the same reasoning as for the initial point. As the experimental results show, the method is accurate, efficient, and robust. Compared with the state of the art, its performance is especially good for low-quality displays, where the physical assumptions made by other models do not hold completely.
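The sketch below illustrates the overall initialize-then-refine loop described above under strong simplifying assumptions: a gamma-plus-matrix forward model with a small flare term stands in for the direct model of Kim and Lee (2015), and the separate chromaticity and luminance corrections are collapsed into one damped linear step. It shows only the structure of the iteration, not the paper's heuristics.

```python
# Sketch only: iterative inversion of an assumed, simplified forward display model.
import numpy as np

M = np.array([[0.4124, 0.3576, 0.1805],    # assumed sRGB-like primary matrix (RGB -> XYZ)
              [0.2126, 0.7152, 0.0722],
              [0.0193, 0.1192, 0.9505]])
GAMMA = 2.2
FLARE = np.array([0.010, 0.010, 0.012])    # assumed black-level/flare offset

def forward_model(rgb: np.ndarray) -> np.ndarray:
    """Stand-in direct model: digital RGB in [0, 1] -> XYZ."""
    return M @ (np.clip(rgb, 0.0, 1.0) ** GAMMA) + FLARE

def inverse_display(xyz_target: np.ndarray, iters: int = 20, step: float = 0.8) -> np.ndarray:
    # Initial point: invert the primary matrix (ignoring flare) to get the RGB ratio,
    # then take the gamma root -- a crude stand-in for the luminance-factor heuristic.
    lin = np.clip(np.linalg.solve(M, xyz_target), 1e-6, None)
    rgb = np.clip(lin, 0.0, 1.0) ** (1.0 / GAMMA)
    for _ in range(iters):
        err = xyz_target - forward_model(rgb)          # prediction error of the direct model
        lin = np.clip(lin + step * np.linalg.solve(M, err), 1e-6, None)
        rgb = np.clip(lin, 0.0, 1.0) ** (1.0 / GAMMA)  # next candidate device input
    return rgb

target = forward_model(np.array([0.2, 0.5, 0.7]))
print(np.round(inverse_display(target), 3))            # recovers approximately [0.2, 0.5, 0.7]
```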
Citations: 0
Efficient road marking extraction via cooperative enhancement of foundation models and Mamba
IF 3.4 | CAS Zone 2 (Engineering & Technology) | Q1 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE | Pub Date: 2026-01-14 | DOI: 10.1016/j.displa.2026.103351
Kang Zheng, Fu Ren
Road marking extraction is critical for high-definition mapping and autonomous driving, yet most lightweight models overlook the long-tailed appearance of thin markings during real-time inference. We propose the Efficient Road Markings Segmentation Network (ERMSNet), a hybrid network that pairs a lightweight design with the expressive power of Mamba and foundation models. ERMSNet comprises three synergistic branches. (1) A wavelet-augmented Baseline embeds a Road-Marking Mamba (RM-Mamba) whose bi-directional vertical scan captures elongated structures with fewer parameters than vanilla Mamba. (2) A Feature Enhancement branch distills dense image embeddings from the frozen Segment Anything Model (SAM) foundation model through a depth-wise squeeze-and-excitation adapter, injecting rich spatial detail at negligible cost. (3) An Attention Focusing branch projects text–image similarities produced by the Contrastive Language-Image Pre-training (CLIP) foundation model as soft masks that steer the decoder toward rare classes. Comprehensive experiments on CamVid and our newly released Wuhan Road-Marking (WHRM) benchmark verify the design. Experimental results demonstrate that ERMSNet, with a lightweight configuration of only 0.99 million parameters and 6.44 GFLOPs, achieves mIoU scores of 79.85% and 81.18% on the two benchmarks, respectively. Compared with existing state-of-the-art methods, ERMSNet significantly reduces computational and memory costs while still delivering outstanding segmentation performance. Its superiority is especially evident in extracting thin and infrequently occurring road markings, highlighting its strong ability to balance efficiency and accuracy. Code and the WHRM dataset will be released upon publication.
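As a rough illustration of the Feature Enhancement idea, the sketch below shows a depth-wise squeeze-and-excitation adapter that recalibrates dense embeddings from a frozen foundation model and projects them to the decoder's channel width. Channel counts and layer choices are assumptions, and a random tensor stands in for the SAM image embedding.

```python
# Sketch only: a depth-wise squeeze-and-excitation adapter over frozen embeddings.
import torch
import torch.nn as nn

class SEAdapter(nn.Module):
    def __init__(self, in_ch: int = 256, out_ch: int = 64, reduction: int = 16):
        super().__init__()
        self.dw = nn.Conv2d(in_ch, in_ch, 3, padding=1, groups=in_ch)   # depth-wise conv
        self.squeeze = nn.AdaptiveAvgPool2d(1)                          # global average pool
        self.excite = nn.Sequential(nn.Conv2d(in_ch, in_ch // reduction, 1), nn.ReLU(),
                                    nn.Conv2d(in_ch // reduction, in_ch, 1), nn.Sigmoid())
        self.proj = nn.Conv2d(in_ch, out_ch, 1)                         # match decoder width

    def forward(self, frozen_embed: torch.Tensor) -> torch.Tensor:
        x = self.dw(frozen_embed)
        x = x * self.excite(self.squeeze(x))     # channel re-weighting
        return self.proj(x)

sam_like_embed = torch.randn(1, 256, 64, 64)     # stand-in for a frozen SAM embedding
features = SEAdapter()(sam_like_embed)           # (1, 64, 64, 64) fed to the decoder
```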
Citations: 0
TDOA based localization mechanism for the UAV positioning in dark and confined environments
IF 3.4 | CAS Zone 2 (Engineering & Technology) | Q1 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE | Pub Date: 2026-01-14 | DOI: 10.1016/j.displa.2026.103346
Haobin Shi, Quantao Wang, Zihan Wang, Jianning Zhan, Huijian Liang, Beiya Yang
With the growing demand for autonomous inspection by Unmanned Aerial Vehicles (UAVs) in dark and confined environments, accurately determining the UAV position has become crucial. Ultra-Wideband (UWB) localization technology offers a promising solution by overcoming the challenges posed by signal obstruction, low illumination, and confined spaces. However, conventional UWB-based positioning suffers from performance oscillations due to measurement inconsistencies and degradation under time-varying noise models. Furthermore, the widely used Two-Way Time-of-Flight (TW-TOF) method has limitations, such as high energy consumption and a restricted number of deployable tags. To address these issues, a sensor fusion approach is proposed that combines UWB and Inertial Measurement Unit (IMU) measurements with a Time Difference of Arrival (TDOA) localization mechanism. The method exploits an adaptive Kalman filter, which dynamically adjusts to noise model variations and employs individual weighting factors for each anchor node, enhancing stability and robustness in challenging environments. Comprehensive experiments demonstrate that the proposed algorithm achieves a median positioning error of 0.110 m, a 90th-percentile error of 0.232 m, and an average standard deviation of 0.075 m with significantly reduced energy consumption. Additionally, owing to the TDOA communication principle, the method supports multiple tag nodes, making it well suited for multi-UAV collaborative inspections in future applications.
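For readers unfamiliar with TDOA, the sketch below shows the range-difference measurement model and a single Kalman-style position correction with per-anchor weights folded into the measurement noise. The anchor layout, noise values, and weighting scheme are illustrative assumptions and do not reproduce the paper's adaptive filter.

```python
# Sketch only: TDOA measurement model and one Kalman-style update with anchor weights.
import numpy as np

anchors = np.array([[0.0, 0.0, 0.0], [10.0, 0.0, 0.0],
                    [0.0, 10.0, 0.0], [10.0, 10.0, 3.0]])   # assumed anchor positions (m)

def tdoa_measurements(pos):
    d = np.linalg.norm(anchors - pos, axis=1)
    return d[1:] - d[0]                        # range differences w.r.t. reference anchor 0

def tdoa_jacobian(pos):
    u = (pos - anchors) / np.linalg.norm(anchors - pos, axis=1, keepdims=True)
    return u[1:] - u[0]                        # d(h_i)/d(pos), shape (3, 3)

def kalman_update(x_pred, P_pred, z, anchor_weights):
    H = tdoa_jacobian(x_pred)
    R = np.diag(0.05 / anchor_weights)         # larger weight -> smaller assumed noise
    S = H @ P_pred @ H.T + R
    K = P_pred @ H.T @ np.linalg.inv(S)        # Kalman gain
    x = x_pred + K @ (z - tdoa_measurements(x_pred))
    P = (np.eye(3) - K @ H) @ P_pred
    return x, P

true_pos = np.array([3.0, 4.0, 1.5])
z = tdoa_measurements(true_pos) + np.random.normal(0.0, 0.05, 3)
x, P = kalman_update(np.array([5.0, 5.0, 1.0]), np.eye(3) * 4.0, z, np.ones(3))
print(np.round(x, 2))                          # moves from the prior toward [3, 4, 1.5]
```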
Citations: 0
Rethinking low-light image enhancement: A local–global synergy perspective
IF 3.4 | CAS Zone 2 (Engineering & Technology) | Q1 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE | Pub Date: 2026-01-13 | DOI: 10.1016/j.displa.2026.103348
Qinghua Lin, Yu Long, Xudong Xiong, Wenchao Jiang, Zhihua Wang, Qiuping Jiang
Low-light image enhancement (LLIE) remains a challenging task due to the complex degradations in illumination, contrast, and structural details. Deep neural network-based approaches have shown promising results in addressing LLIE. However, most existing methods either utilize convolutional layers with local receptive fields, which are well-suited for restoring local textures, or Transformer layers with long-range dependencies, which are better at correcting global illumination. Despite their respective strengths, these approaches often struggle to effectively handle both aspects simultaneously. In this paper, we revisit LLIE from a local–global synergy perspective and propose a unified framework, the Local–Global Synergy Network (LGS-Net). LGS-Net explicitly extracts local and global features in parallel using a separable CNN and a Swin Transformer block, respectively, effectively modeling both local structural fidelity and global illumination balance. The extracted features are then fed into a squeeze-and-excitation-based fusion module, which adaptively integrates multi-scale information guided by perceptual relevance. Extensive experiments on multiple real-world benchmarks show that our method consistently outperforms existing state-of-the-art methods across both quantitative metrics (e.g., PSNR, SSIM, Q-Align) and perceptual quality, with notable improvements in color fidelity and detail preservation under extreme low-light and non-uniform illumination.
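The sketch below renders the local-global idea in its simplest possible form, with toy shapes: a depthwise-separable convolution branch for local texture, a plain self-attention layer standing in for the Swin block's global context, and a channel gate fusing the two. It is an assumption-laden illustration, not the LGS-Net architecture.

```python
# Sketch only: parallel local (separable conv) and global (attention) branches with gated fusion.
import torch
import torch.nn as nn

class LocalGlobalBlock(nn.Module):
    def __init__(self, ch: int = 32):
        super().__init__()
        self.local = nn.Sequential(nn.Conv2d(ch, ch, 3, padding=1, groups=ch),   # depth-wise
                                   nn.Conv2d(ch, ch, 1), nn.GELU())              # point-wise
        self.global_attn = nn.MultiheadAttention(ch, num_heads=4, batch_first=True)
        self.gate = nn.Sequential(nn.AdaptiveAvgPool2d(1),
                                  nn.Conv2d(2 * ch, 2 * ch, 1), nn.Sigmoid())
        self.fuse = nn.Conv2d(2 * ch, ch, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        local = self.local(x)                               # local structural detail
        tokens = x.flatten(2).transpose(1, 2)               # (B, HW, C) token sequence
        glob, _ = self.global_attn(tokens, tokens, tokens)  # global illumination context
        glob = glob.transpose(1, 2).reshape(b, c, h, w)
        both = torch.cat([local, glob], dim=1)
        return self.fuse(both * self.gate(both))            # channel-wise weighted fusion

out = LocalGlobalBlock()(torch.randn(1, 32, 32, 32))        # stand-in low-light feature map
```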
Citations: 0
Endo-E2E-GS: End-to-end 3D reconstruction of endoscopic scenes using Gaussian Splatting
IF 3.4 | CAS Zone 2 (Engineering & Technology) | Q1 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE | Pub Date: 2026-01-13 | DOI: 10.1016/j.displa.2026.103353
Xiongzhi Wang, Boyu Yang, Min Wei, Yu Chen, Jingang Zhang, Yunfeng Nie
Three-dimensional (3D) reconstruction is essential for enhancing spatial perception and geometric understanding in minimally invasive surgery. However, current methods like Neural Radiance Fields (NeRF) and 3D Gaussian Splatting (3DGS) often rely on offline preprocessing—such as COLMAP-based point clouds or multi-frame fusion—limiting their adaptability and clinical deployment. We propose Endo-E2E-GS, a fully end-to-end framework that reconstructs structured 3D Gaussian fields directly from a single stereo endoscopic image pair. The system integrates (1) a DilatedResNet-based stereo depth estimator for robust geometry inference in low-texture scenes, (2) a Gaussian attribute predictor that infers per-pixel rotation, scale, and opacity, and (3) a differentiable splatting renderer for 2D view supervision. Evaluated on the ENDONERF and SCARED datasets, Endo-E2E-GS achieves highly competitive performance, reaching PSNR values of 38.874/33.052 and SSIM scores of 0.978/0.863, respectively, surpassing recent state-of-the-art approaches. It requires no explicit scene initialization and demonstrates consistent performance across two representative endoscopic datasets. Code is available at: https://github.com/Intelligent-Imaging-Center/Endo-E2E-GS.
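As a rough sketch of the end-to-end idea, the code below back-projects a depth map into per-pixel 3D Gaussian centers using pinhole intrinsics and predicts per-pixel scale, rotation, and opacity with small convolutional heads. The intrinsics, resolutions, and channel counts are assumptions, and no claim is made that this matches the paper's network.

```python
# Sketch only: stereo-depth pixels -> per-pixel Gaussian primitives.
import torch
import torch.nn as nn

def backproject(depth: torch.Tensor, fx: float, fy: float, cx: float, cy: float) -> torch.Tensor:
    h, w = depth.shape
    v, u = torch.meshgrid(torch.arange(h, dtype=torch.float32),
                          torch.arange(w, dtype=torch.float32), indexing="ij")
    x = (u - cx) / fx * depth
    y = (v - cy) / fy * depth
    return torch.stack([x, y, depth], dim=-1).reshape(-1, 3)   # (H*W, 3) Gaussian centers

class GaussianHeads(nn.Module):
    def __init__(self, feat_ch: int = 32):
        super().__init__()
        self.scale = nn.Conv2d(feat_ch, 3, 1)      # per-axis log-scale
        self.rot = nn.Conv2d(feat_ch, 4, 1)        # unnormalized quaternion
        self.opacity = nn.Conv2d(feat_ch, 1, 1)

    def forward(self, feat: torch.Tensor):
        scale = torch.exp(self.scale(feat))                    # positive scales
        quat = nn.functional.normalize(self.rot(feat), dim=1)  # unit quaternion per pixel
        alpha = torch.sigmoid(self.opacity(feat))              # opacity in (0, 1)
        return scale, quat, alpha

depth = torch.rand(120, 160) * 0.10 + 0.05           # metres, toy endoscopic working range
centers = backproject(depth, fx=200.0, fy=200.0, cx=80.0, cy=60.0)
scale, quat, alpha = GaussianHeads()(torch.randn(1, 32, 120, 160))   # stand-in image features
```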
Citations: 0
Differences in streaming quality impact viewer expectations, attitudes and reactions to video
IF 3.4 | CAS Zone 2 (Engineering & Technology) | Q1 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE | Pub Date: 2026-01-12 | DOI: 10.1016/j.displa.2026.103350
Christopher A. Sanchez, Nisha Raghunath, Chelsea Ahart
Given the massive amount of visual media consumed around the world every day, an open question is whether deviations from high-quality streaming can negatively impact viewers' opinions of and attitudes towards the viewed content. Previous research has shown that reductions in perceptual quality can negatively impact attitudes in other contexts, and such changes in quality often lead to corresponding changes in attitudes. Are users sensitive to changes in video quality, and does this impact reactions to viewed content? For example, do users enjoy lower-quality videos as much as higher-quality versions? Do quality differences also make viewers less receptive to the content of videos? Across two studies, participants watched a video in lower or higher quality and were then queried about their viewing experience. This included ratings of attitudes towards video streaming and video content, as well as measures of factual recall. Results indicated that viewers significantly prefer videos presented in higher quality, which drives future viewing intentions. Further, while factual memory for information was equivalent across video quality, participants who viewed the higher-quality video were more likely to show an affective reaction to the video and to change their attitudes relative to the presented content. These results have implications for the design and delivery of online video content and suggest that any deviation from higher-quality presentation can bias opinions of the viewed content. Lower-quality videos decreased attitudes towards content and also negatively impacted viewers' receptiveness to the presented content.
Citations: 0
Towards LiDAR point cloud geometry compression using rate-distortion optimization and adaptive quantization for human-machine vision
IF 3.4 | CAS Zone 2 (Engineering & Technology) | Q1 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE | Pub Date: 2026-01-11 | DOI: 10.1016/j.displa.2026.103344
Yihan Wang, Yongfang Wang, Shuo Zhu, Zhijun Fang
Owing to rapid advances in 3-Dimensional (3D) sensing and rendering technologies, point clouds have become increasingly widespread, bringing significant challenges for transmission and storage. Existing LiDAR Point Cloud Compression (PCC) methods primarily focus on enhancing compression efficiency and maintaining high signal fidelity, while giving insufficient consideration to joint human and machine perception. This paper proposes Rate Distortion Optimization (RDO) and Adaptive Quantization (AQ) for LiDAR Point Cloud Geometry Compression (PCGC) to balance human and machine vision performance. Specifically, we first propose Hybrid Distortion RDO (HDRDO) using a hybrid distortion and a Lagrange multiplier, where the optimal weights are determined by the Differential Evolution (DE) algorithm. Furthermore, by comprehensively analyzing the impact on overall quality of points classified with a Gaussian-based method, we propose an HDRDO-based AQ method that adaptively quantizes important and non-important points through optimal Quantization Parameter (QP) selection. We implement the approach on the Geometry-based Point Cloud Compression (G-PCC) Test Model Categories 1 and 3 (TMC13), which serves as the anchor. Compared with the anchor, the proposed algorithm achieves consistent PSNR for human vision tasks and improves accuracy at low bitrates by 2.66% and 21.18% for detection and segmentation, respectively. Notably, the proposed method as a whole performs better than the existing method.
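The sketch below shows the generic Lagrangian decision J = D + λR that underlies such RDO: each candidate quantization parameter is scored with a hybrid distortion (a weighted fidelity term plus a stand-in machine-task term) and a crude rate proxy, and the cheapest QP wins. The weights, QP-to-step mapping, and cost models are placeholders rather than the paper's.

```python
# Sketch only: Lagrangian QP selection with a toy hybrid distortion and rate proxy.
import numpy as np

def hybrid_distortion(points, qp, w_human=0.6, w_machine=0.4):
    step = 2.0 ** ((qp - 4) / 6)                     # assumed QP-to-step mapping
    quant = np.round(points / step) * step
    d_human = np.mean((points - quant) ** 2)         # geometric (PSNR-like) term
    d_machine = np.mean(np.abs(points - quant))      # stand-in for a task-aware term
    return w_human * d_human + w_machine * d_machine

def rate_estimate(points, qp):
    step = 2.0 ** ((qp - 4) / 6)
    return len(np.unique(np.round(points / step), axis=0))   # crude proxy: distinct symbols

def select_qp(points, lam=0.02, candidates=range(16, 41, 4)):
    costs = {qp: hybrid_distortion(points, qp) + lam * rate_estimate(points, qp)
             for qp in candidates}
    return min(costs, key=costs.get)                 # RDO: argmin of D + lambda * R

block = np.random.rand(256, 3) * 10.0                # toy block of LiDAR points
print("chosen QP:", select_qp(block))
```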
Citations: 0
Direct LiDAR-supervised surface-aligned 3D Gaussian Splatting for high-fidelity digital twin
IF 3.4 | CAS Zone 2 (Engineering & Technology) | Q1 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE | Pub Date: 2026-01-10 | DOI: 10.1016/j.displa.2026.103349
Xingdong Sheng, Qi Zhou, Xu Liu, Zhenyang Qu, Haoyu Xu, Shijie Mao, Xiaokang Yang
3D Gaussian Splatting (3DGS) has recently demonstrated remarkable rendering speed and photorealistic quality for 3D reconstruction. Yet precise surface reconstruction and view-consistent photometric fidelity remain challenging, because the standard pipeline lacks explicit geometry supervision. Several recent approaches incorporate dense LiDAR point clouds as guidance, typically by aligning Gaussian centers or projecting LiDAR points into pseudo-depth maps. However, such methods constrain positions only and overlook the anisotropic shapes of the Gaussians, often resulting in rough surfaces and residual artifacts. To overcome these limitations, we propose a direct LiDAR-supervised surface-aligned regularization loss that simultaneously constrains Gaussian positions and shapes without converting LiDAR scans into depth maps. We further introduce adaptive densification and a multi-view depth-guided pruning strategy to enhance fidelity and suppress floaters. Extensive experiments on diverse indoor and outdoor datasets that represent the demands of industrial digital-twin applications show that our method consistently improves photorealistic rendering, even under significant viewpoint deviations, demonstrating advantages over existing typical LiDAR-assisted 3DGS methods.
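A minimal sketch of what a LiDAR-supervised, surface-aligned regularizer could look like is given below: each Gaussian center is pulled toward its nearest LiDAR point, and the smallest scale axis is penalized so that primitives flatten onto the scanned surface. The loss form, weights, and brute-force nearest-neighbour search are assumptions, not the paper's formulation.

```python
# Sketch only: a LiDAR-supervised regularizer over Gaussian positions and shapes.
import torch

def surface_aligned_loss(centers, log_scales, lidar_pts, w_pos=1.0, w_flat=0.1):
    # centers: (N, 3), log_scales: (N, 3), lidar_pts: (M, 3)
    d = torch.cdist(centers, lidar_pts)                    # (N, M) pairwise distances
    nearest = d.min(dim=1).values                          # distance to closest LiDAR point
    pos_term = (nearest ** 2).mean()                       # position supervision
    flat_term = torch.exp(log_scales).min(dim=1).values.mean()   # keep one axis thin
    return w_pos * pos_term + w_flat * flat_term

centers = torch.rand(1000, 3, requires_grad=True)
log_scales = torch.zeros(1000, 3, requires_grad=True)
lidar = torch.rand(5000, 3)
loss = surface_aligned_loss(centers, log_scales, lidar)
loss.backward()                                            # gradients reach positions and shapes
```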
Citations: 0
Leveraging the power of eye-tracking for virtual prototype evaluation: a comparison between virtual reality and photorealistic images
IF 3.4 | CAS Zone 2 (Engineering & Technology) | Q1 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE | Pub Date: 2026-01-10 | DOI: 10.1016/j.displa.2026.103343
Almudena Palacios-Ibáñez, Manuel F. Contero-López, Santiago Castellet-Lathan, Nathan Hartman, Manuel Contero
Most of the information we gather from our environment comes from sight; hence, visual evaluation is vital for assessing products. However, designers have traditionally relied on self-report questionnaires for this purpose, which have proven insufficient in some cases. Consequently, physiological measures are being employed to gain a deeper understanding of the cognitive and perceptual processes involved in product evaluation, and, thanks to their integration into Virtual Reality (VR) headsets, they have become a powerful tool for virtual prototype assessment. Still, using virtual prototypes raises some concerns, as previous studies have found that the medium can influence product perception. Those results rely solely on self-report techniques, highlighting the need to explore the use of eye-tracking (ET) for product assessment, which is the main objective of this research. We present two case studies in which a group of participants assessed, through two display mediums, (CS-1) a set of furniture composing a general scene, using a ranking-type evaluation (i.e., joint assessment), and (CS-2) two armchairs individually, using the Semantic Differential technique. Moreover, the dwell time for the defined Areas of Interest (AOIs) was recorded. Our results showed that, although VR is sensitive to aesthetic differences between designs of the same product typology, the medium may still influence the perception of specific product attributes, e.g., fragility (p_MODERN < 0.001, p_TRADITIONAL = 0.002), and the observation of specific AOIs, e.g., AOI1 (p_MODERN = 0.003, p_TRADITIONAL < 0.001) and AOI9 and AOI10 (p < 0.001). At the same time, no differences were found in the perception of the general scene, whereas dwell time was influenced for AOI1 (p = 0.003), AOI4 (p = 0.006), and AOI5 (p < 0.001). Additionally, the university of origin may also be a factor influencing product evaluation, while confidence in the response was not affected by the medium. Hence, this study contributes to a deeper understanding of how the medium influences product perception by combining ET with self-report methods, offering valuable insights into user behavior.
Citations: 0