Pub Date: 2026-07-01. Epub Date: 2026-01-27. DOI: 10.1016/j.displa.2026.103368
Hao Liu, Maoji Qiu, Rong Huang
For block compressive sensing (BCS) of natural videos, existing reconstruction algorithms typically utilize nonlocal self-similarity (NSS) to generate sparse residuals, thereby achieving favorable recovery performance by exploiting the statistical characteristics of key frames and non-key frames. However, when applied to multi-perspective infrared aerial videos rather than natural videos, these reconstruction algorithms usually yield poor recovery quality because of their inflexibility in selecting similar patches and poor adaptability to dynamic scene changes. Due to the distribution properties of infrared aerial imagery, inter-frame and intra-frame similar patches should be selected adaptively so that an accurate dictionary matrix can be learned. Therefore, this paper proposes a content-adaptive dual feature selection mechanism. It first conducts a rough screening of inter-frame and intra-frame similar patches based on the correlation of observed measurement vectors across frames. This is followed by a fine screening stage, in which principal component analysis (PCA) is applied to project the similar patch-group matrix into a low-dimensional space. Finally, the split Bregman iteration (SBI) is employed to solve the BCS reconstruction for infrared aerial video. Experimental results on both the HIT-UAV and M200-XT2DroneVehicle datasets demonstrate that the proposed algorithm achieves better recovery quality than state-of-the-art algorithms.
{"title":"Content-adaptive dual feature selection for infrared aerial video compressive sensing reconstruction","authors":"Hao Liu , Maoji Qiu , Rong Huang","doi":"10.1016/j.displa.2026.103368","DOIUrl":"10.1016/j.displa.2026.103368","url":null,"abstract":"<div><div>For block compressive sensing (BCS) of natural videos, existing reconstruction algorithms typically utilize nonlocal self-similarity (NSS) to generate sparse residuals, thereby achieving favorable recovery performance by exploiting the statistical characteristics of key frames and non-key frames. However, when applied to multi-perspective infrared aerial videos rather than natural videos, these reconstruction algorithms usually result in poor recovery quality because of the inflexibility in selecting similar patches and poor adaptability to dynamic scene changes. Due to the distribution property of infrared aerial imagery, inter-frame and intra-frame similar patches should be selected adaptively so that an accurate dictionary matrix can be learned. Therefore, this paper proposes a content-adaptive dual feature selection mechanism. It first conducts a rough screening of inter-frame and intra-frame similar patches based on the correlation of observed measurement vectors across frames. Then, it is followed by a fine screening stage, where principal component analysis (PCA) is applied to project the similar patch-group matrix into a low-dimensional space. Finally, the split Bregman iteration (SBI) is employed to solve the BCS reconstruction for infrared aerial video. Experimental results on both HIT-UAV and M200-XT2DroneVehicle datasets demonstrate that the proposed algorithm achieves better recovery quality compared to state-of-the-art algorithms.</div></div>","PeriodicalId":50570,"journal":{"name":"Displays","volume":"93 ","pages":"Article 103368"},"PeriodicalIF":3.4,"publicationDate":"2026-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146070880","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2026-07-01. DOI: 10.1016/j.displa.2026.103384
Yuling Luo, Zhaohui Chen, Baoshan Lu, Yiting Huang, Qiang Fu, Sheng Qin, Junxiu Liu
Generative Adversarial Networks (GANs) have significantly improved data security in image steganography. However, existing GAN-based approaches often fail to consider the impact of transmission noise and rely on separately trained encoder–decoder architectures, which hinder the accurate recovery of hidden image data. To address these limitations, we propose a Residual and Multi-Attention Enhanced GAN (RME-GAN) for image steganography, which integrates residual networks, attention mechanisms, and multi-objective optimization to effectively enhance the recovery quality of secret images. In the generator, a residual preprocessing network combined with a global attention mechanism is employed to efficiently extract transmission noise features. In the extractor, a gated attention module is introduced to align the encoder and decoder features, thereby improving decoding accuracy. Moreover, a multi-objective loss function is formulated to jointly optimize both encoder and decoder through end-to-end training, enhancing the consistency between them. Experimental results on widely used datasets, including LFW, ImageNet, and Pascal, demonstrate that the proposed RME-GAN achieves superior robustness against noise and significantly improves Peak Signal-to-Noise Ratio (PSNR) and Structural Similarity Index Measure (SSIM) performance compared to existing methods.
Title: Robust image steganography based on residual and multi-attention enhanced Generative Adversarial Networks. Displays, Volume 93, Article 103384.
Pub Date: 2026-07-01. Epub Date: 2026-02-09. DOI: 10.1016/j.displa.2026.103391
Jia Liu, Ao Zhang, Kun Zhang
With the wide application of 3D reconstruction technology in many fields, its efficient realization has become a research focus. Traditional 3D reconstruction methods often adopt a relatively fixed mode when handling indoor single-scene and outdoor multi-scene settings, making it difficult to adjust flexibly to the scene's complexity. Therefore, this paper proposes a 3D reconstruction method based on dynamic perception of scene complexity. To begin with, the scene complexity system is constructed. Next, a binary mask based on transparency and volume is used to screen out points that contribute minimally to the scene. Subsequently, we combine scene complexity with an octree structure to realize dynamic spatial streamlining, which preserves rendering quality while significantly improving system efficiency. Comparative experiments on the Mip-NeRF 360, Tanks&Temples, and Deep Blending datasets demonstrate that our method outperforms existing approaches in both evaluation metrics and visual quality, validating its effectiveness.
{"title":"Scene complexity dynamic perception for 3D reconstruction","authors":"Jia Liu, Ao Zhang, Kun Zhang","doi":"10.1016/j.displa.2026.103391","DOIUrl":"10.1016/j.displa.2026.103391","url":null,"abstract":"<div><div>With the wide application of 3D reconstruction technology in many fields, its efficient realization has become a research focus. The traditional 3D reconstruction method often adopts a relatively fixed mode when facing indoor single-scene and outdoor multi-scene, which is challenging to adjust flexibly according to the scene’s complexity. Therefore, this paper proposes a 3D reconstruction method based on the dynamic perception of scene complexity. To begin with, the scene complexity system is constructed. Next, based on the binary mask technology of transparency and volume, the points in the scene with minimal contribution are screened out. Subsequently, we combine the scene complexity with the octree structure to realize the spatial dynamic streamlining, which ensures the rendering quality and significantly improves the system efficiency at the same time. We conduct comparative experiments on Mip-NeRF 360, Tanks&Temples, and Deep Blending datasets to demonstrate that our method outperforms existing evaluation metrics and visual quality, thus validating its effectiveness.</div></div>","PeriodicalId":50570,"journal":{"name":"Displays","volume":"93 ","pages":"Article 103391"},"PeriodicalIF":3.4,"publicationDate":"2026-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146191724","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2026-07-01. Epub Date: 2026-02-09. DOI: 10.1016/j.displa.2026.103389
Qihui Li, Qiliang Du, Lianfang Tian, Guoyu Lu
Point cloud feature extraction and rotation matrix prediction are fundamental tasks in robot perception and 3D computer vision, with critical applications in robot pose estimation, object recognition, and manipulation based on LiDAR, RGB-D, or regular RGB cameras mounted on robots. However, existing methods typically address these two problems separately, often overlooking the intrinsic relationship between them. In this paper, we propose an innovative learning framework that jointly considers rotation invariance and rotation matrix prediction to enhance point cloud feature extraction. Specifically, we use two parallel branches to extract features from the point clouds. One branch predicts the rotation matrix based on different feature representations. The other branch ensures the consistency of global features between the rotated point clouds for downstream tasks. By balancing the variability and invariance of the features, our approach further improves the robustness and accuracy of downstream tasks. Additionally, we introduce a multi-scale feature extraction (MSFE) module, which better captures the local features of point clouds, and an attention-based global feature aggregation (AGFA) module, which enhances the capture of global features, leading to improved overall performance. Our method is not only effective but also lightweight, with relatively few parameters and low computational requirements, making it well suited for deployment on mobile devices. It has the potential to significantly enhance robot capabilities in object recognition, perception, and navigation tasks, especially in dynamic and unstructured environments.
{"title":"Enhancing point cloud feature extraction for effective robot perception","authors":"Qihui Li , Qiliang Du , Lianfang Tian , Guoyu Lu","doi":"10.1016/j.displa.2026.103389","DOIUrl":"10.1016/j.displa.2026.103389","url":null,"abstract":"<div><div>Point cloud feature extraction and the rotation matrix prediction are fundamental tasks in robot perception and 3D computer vision, with critical applications in robot pose estimation, object recognition, and manipulation based on LiDAR, RGB-D, or regular RGB cameras mounted on robots. However, existing methods typically address these two problems separately, often overlooking the intrinsic relationship between them. In this paper, we propose an innovative learning framework that jointly considers rotation invariance and the rotation matrix prediction to enhance point cloud feature extraction. Specifically, we use two parallel branches to extract features from the point clouds. One branch predicts the rotation matrix based on different feature representations. The other branch ensures the consistency of global features between the rotated point clouds for downstream tasks. By balancing the variability and invariance of the features, our approach further improves the robustness and accuracy of downstream tasks. Additionally, we introduce a multi-scale feature extraction module (MSFE), which better captures the local features of the point clouds. We also introduce an attention-based global feature aggregation (AGFA) module, which enhances the capture of global features, leading to improved overall performance. Our method is not only effective but also lightweight. It has relatively small parameters and low computational requirements, which are well-suited for deployment on mobile devices. It has the potential to significantly enhance robot capabilities in object recognition, perception, and navigation tasks, especially in dynamic and unstructured environments.</div></div>","PeriodicalId":50570,"journal":{"name":"Displays","volume":"93 ","pages":"Article 103389"},"PeriodicalIF":3.4,"publicationDate":"2026-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146191704","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2026-07-01. Epub Date: 2026-02-04. DOI: 10.1016/j.displa.2026.103372
Chen Yang, Jixiang Nie, Hui Chen, Weina Wang, Wanquan Liu
Point cloud registration typically relies on point-pair feature extraction. However, point cloud features are low-dimensional, and point-wise processing lacks topological structure and leads to high computational complexity. To address these challenges, a multi-view 3D point cloud registration method based on generated multi-scale information granules is proposed to build a complete 3D reconstruction. Specifically, during the granule generation process, Fast Point Feature Histograms (FPFH) are integrated into fuzzy C-means clustering to preserve geometric features while reducing computational cost. Furthermore, to ensure feature completeness across regions with varying densities, a surface complexity threshold is employed to merge fine-grained granules and eliminate relatively flat surfaces. This approach avoids over-segmentation and redundancy, thereby improving the efficiency of point cloud processing. Finally, to tackle the uneven distribution of overlapping areas and noise-induced mismatches, a hierarchical GMM-based 3D registration framework built on multi-scale information granules is constructed. Point cloud granules are dynamically updated in real time to ensure registration between granules with complete geometric features, thus improving registration accuracy. Experiments conducted on benchmark datasets and real-world collected data demonstrate that the proposed method outperforms existing methods in multi-view registration, offering improved accuracy and efficiency.
{"title":"Multi-view 3D point cloud registration method based on generated multi-scale information granules","authors":"Chen Yang , Jixiang Nie , Hui Chen , Weina Wang , Wanquan Liu","doi":"10.1016/j.displa.2026.103372","DOIUrl":"10.1016/j.displa.2026.103372","url":null,"abstract":"<div><div>Point cloud registration typically relies on point-pair feature extraction. However, point cloud features are low-dimensional, and point-wise processing lacks topological structure and leads to high computational complexity. Address to these challenges, a multi-view 3D point cloud registration method based on generated multi-scale information granules is proposed to build the completed 3D reconstruction. Specifically, during the granule generation process, Fast Persistent Feature Histograms (FPFH) are integrated into Fuzzy C-means clustering to ensure the preservation of geometric features while reducing computational cost. Furthermore, to ensure feature completeness across regions with varying densities, a surface complexity threshold is employed to merge fine-grained granules and eliminate relatively flat surfaces. This approach avoids over-segmentation and redundancy, thereby improving the efficiency of point cloud processing. Finally, to tackle the uneven distribution of overlapping areas and noise-induced mismatches, a hierarchical GMM-based 3D registration framework based on multi-scale information granules is constructed. Point cloud granules are dynamically updated in real time to ensure registration between granules with complete geometric features, thus improving registration accuracy. Experiments conducted on benchmark datasets and real-world collected data demonstrate that the proposed method outperforms existing methods in multi-view registration, offering improved accuracy and efficiency.</div></div>","PeriodicalId":50570,"journal":{"name":"Displays","volume":"93 ","pages":"Article 103372"},"PeriodicalIF":3.4,"publicationDate":"2026-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146191253","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2026-07-01. Epub Date: 2026-01-28. DOI: 10.1016/j.displa.2026.103364
Kexuan Shi, Zhuang Qi, Jingjing Zhu, Lei Meng, Yaochen Zhang, Haibei Huang, Xiangxu Meng
Open-set few-shot image classification aims to train models using a small amount of labeled data, enabling them to achieve good generalization when confronted with unknown environments. Existing methods mainly use visual information from a single image to learn class representations to distinguish known from unknown categories. However, these methods often overlook the benefits of integrating rich contextual information. To address this issue, this paper proposes a prototypical augmentation and alignment method, termed ProtoConNet, which incorporates background information from different samples to enhance the diversity of the feature space, breaking the spurious associations between context and image subjects in few-shot scenarios. Specifically, it consists of three main modules: the clustering-based data selection (CDS) module mines diverse data patterns while preserving core features; the contextual-enhanced semantic refinement (CSR) module builds a context dictionary to integrate into image representations, which boosts the model’s robustness in various scenarios; and the prototypical alignment (PA) module reduces the gap between image representations and class prototypes, amplifying feature distances for known and unknown classes. Experimental results from two datasets verified that ProtoConNet enhances the effectiveness of representation learning in few-shot scenarios and identifies open-set samples, making it superior to existing methods.
{"title":"ProtoConNet: Prototypical augmentation and alignment for open-set few-shot image classification","authors":"Kexuan Shi , Zhuang Qi , Jingjing Zhu , Lei Meng , Yaochen Zhang , Haibei Huang , Xiangxu Meng","doi":"10.1016/j.displa.2026.103364","DOIUrl":"10.1016/j.displa.2026.103364","url":null,"abstract":"<div><div>Open-set few-shot image classification aims to train models using a small amount of labeled data, enabling them to achieve good generalization when confronted with unknown environments. Existing methods mainly use visual information from a single image to learn class representations to distinguish known from unknown categories. However, these methods often overlook the benefits of integrating rich contextual information. To address this issue, this paper proposes a prototypical augmentation and alignment method, termed ProtoConNet, which incorporates background information from different samples to enhance the diversity of the feature space, breaking the spurious associations between context and image subjects in few-shot scenarios. Specifically, it consists of three main modules: the clustering-based data selection (CDS) module mines diverse data patterns while preserving core features; the contextual-enhanced semantic refinement (CSR) module builds a context dictionary to integrate into image representations, which boosts the model’s robustness in various scenarios; and the prototypical alignment (PA) module reduces the gap between image representations and class prototypes, amplifying feature distances for known and unknown classes. Experimental results from two datasets verified that ProtoConNet enhances the effectiveness of representation learning in few-shot scenarios and identifies open-set samples, making it superior to existing methods.</div></div>","PeriodicalId":50570,"journal":{"name":"Displays","volume":"93 ","pages":"Article 103364"},"PeriodicalIF":3.4,"publicationDate":"2026-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146191252","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2026-07-01. Epub Date: 2026-02-09. DOI: 10.1016/j.displa.2026.103357
Yun Liang, Yuting Xiao, Zihan Zhou, Hongyu Wang, Jiabin Zhang, Jing Li, Yong Xu, Patrick Le Callet
Deep neural networks have shown remarkable progress in blind image quality assessment. However, accurately modeling human visual perception remains challenging due to the wide variations in image content and the complex interplay of distortion types. Existing methods, relying on content-agnostic or fixed receptive field approaches, struggle to capture adaptive perceptual features linking semantic regions and distortion perception. To address these limitations, we propose the dual perception-aware model, a two-stage framework that integrates semantic- and distortion-aware representations and then explores dynamic global–local feature extraction. First, our method leverages superpixel similarity indicators as semantic-aware representations that capture perceptually coherent regions, enabling subsequent content-adaptive feature extraction beyond traditional grid-based methods. A cross-attention mechanism then facilitates mutual modulation between semantic importance and distortion sensitivity, allowing the model to focus on perceptually critical areas while maintaining distortion awareness. Second, we design an adaptive parallel feature extraction unit combining vision transformer blocks with enhanced adaptive filtering residual blocks, achieving comprehensive global–local feature representation that adapts to image-specific characteristics, followed by a weighted dual-pathway regressor for content-tailored quality predictions. Extensive experiments on benchmark datasets containing both synthetic and authentic distortions demonstrate superior performance compared to state-of-the-art methods, with comprehensive ablation studies validating the effectiveness of each proposed component.
{"title":"Dual perception-aware blind image quality assessment with semantic-distortion integration and dynamic global–local refinement","authors":"Yun Liang , Yuting Xiao , Zihan Zhou , Hongyu Wang , Jiabin Zhang , Jing Li , Yong Xu , Patrick Le Callet","doi":"10.1016/j.displa.2026.103357","DOIUrl":"10.1016/j.displa.2026.103357","url":null,"abstract":"<div><div>Deep neural networks have shown remarkable progress in blind image quality assessment. However, accurately modeling human visual perception remains challenging due to the wide variations in image content and the complex interplay of distortion types. Existing methods, relying on content-agnostic or fixed receptive field approaches, struggle to capture adaptive perceptual features linking semantic regions and distortion perception. To address these limitations, we propose the dual perception-aware model, a two-stage framework integrating semantic- and distortion-aware representations, and then exploring dynamic global–local feature extraction. First, our method leverages superpixel similarity indicators as semantic-aware representations that capture perceptually coherent regions, enabling subsequent content-adaptive feature extraction beyond traditional grid-based methods. A cross-attention mechanism then facilitates mutual modulation between semantic importance and distortion sensitivity, allowing the model to focus on perceptually critical areas while maintaining distortion awareness. Second, we design an adaptive parallel feature extraction unit combining vision transformer blocks with enhanced adaptive filtering residual blocks, achieving comprehensive global–local feature representation that adapts to image-specific characteristics, followed by a weighted dual-pathway regressor for content-tailored quality predictions. Extensive experiments on benchmark datasets containing both synthetic and authentic distortions demonstrate superior performance compared to state-of-the-art methods, with comprehensive ablation studies validating the effectiveness of each proposed component.</div></div>","PeriodicalId":50570,"journal":{"name":"Displays","volume":"93 ","pages":"Article 103357"},"PeriodicalIF":3.4,"publicationDate":"2026-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146191706","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2026-07-01. Epub Date: 2026-02-02. DOI: 10.1016/j.displa.2026.103374
Yani Guo, Zhenhong Jia, Gang Zhou, Xiaohui Huang, Yue Li, Mingyan Li, Guohong Chen, Junjie Li
Numerous obstacles are faced in change detection tasks for large-field-of-view video images (e.g., those acquired by Eagle Eye devices) in low-light environments, mainly due to the difficulty in differentiating genuine changes from illumination-induced pseudo-changes, vulnerability to intricate noise interference, and constrained robustness in multi-scale change detection. To address these issues, a deep learning framework for large-field-of-view change detection in low-light environments is proposed in this paper, consisting of three core modules: Cross-scale Attention Feature Fusion, Difference Enhancement and Optimization, and Pseudo-Change Suppression and Multi-scale Fusion. Initially, the Cross-scale Attention Feature Fusion (CAF) module employs a cross-scale attention mechanism to fuse multi-scale features, capturing change information at various scales. Structural differences are then enhanced by the Difference Enhancement and Optimization (DEO) module through frequency-domain decomposition and boundary-aware strategies, mitigating the impact of illumination variations. Subsequently, illumination-induced pseudo-changes are suppressed by the Pseudo-Change Suppression and Multi-scale Fusion (PSF) module with Pseudo-Change Filtering Attention, and multi-scale feature fusion is performed to generate accurate change maps. Additionally, an end-to-end optimization strategy is introduced, incorporating contrastive learning and self-supervised pseudo-label generation, to further enhance the model's robustness and generalization across various low-light scenarios. Experimental results demonstrate that, compared with other methods, the proposed method improves the F1 score by 3.65% and accuracy by 1.84%, verifying its ability to accurately distinguish between real and false changes in low-light environments.
{"title":"Change detection of large-field-of-view video images in low-light environments with cross-scale feature fusion and pseudo-change mitigation","authors":"Yani Guo , Zhenhong Jia , Gang Zhou , Xiaohui Huang , Yue Li , Mingyan Li , Guohong Chen , Junjie Li","doi":"10.1016/j.displa.2026.103374","DOIUrl":"10.1016/j.displa.2026.103374","url":null,"abstract":"<div><div>Numerous obstacles are faced in change detection tasks for large-field-of-view video images (e.g., those acquired by Eagle Eye devices) in low-light environments, mainly due to the difficulty in differentiating genuine changes from illumination-induced pseudo-changes, vulnerability to intricate noise interference, and constrained robustness in multi-scale change detection. To address these issues, a deep learning framework for large-field-of-view change detection in low-light environments is proposed in this paper, consisting of three core modules: Cross-scale Attention Feature Fusion, Difference Enhancement and Optimization, and Pseudo-Change Suppression and Multi-scale Fusion. Initially, the Cross-scale Attention Feature Fusion (CAF) module employs a cross-scale attention mechanism to fuse multi-scale features, capturing change information at various scales. Structural differences are then enhanced by the Difference Enhancement and Optimization (DEO) module through frequency-domain decomposition and boundary-aware strategies, mitigating the impact of illumination variations. Subsequently, illumination-induced pseudo-changes are suppressed by the Pseudo-Change Suppression and Multi-scale Fusion (PSF) module with Pseudo-Change Filtering Attention, and multi-scale feature fusion is performed to generate accurate change maps. Additionally, an end-to-end optimization strategy is introduced, incorporating contrastive learning and self-supervised pseudo-label generation, to further enhance the model’s robustness and generalization across various low-light scenarios. Experimental results demonstrate that, compared with other methods, The method described in this paper improved the F1 score by 3.65% and accuracy by 1.84%, verifying its ability to accurately distinguish between real and false changes in low-light environments.</div></div>","PeriodicalId":50570,"journal":{"name":"Displays","volume":"93 ","pages":"Article 103374"},"PeriodicalIF":3.4,"publicationDate":"2026-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146191088","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2026-07-01. Epub Date: 2026-02-02. DOI: 10.1016/j.displa.2026.103377
Aizhong Zhou, Fengbo Wang, Jiong Guo, Yutao Liu
The Mixture of Experts (MoE) is a neural network architecture that is widely used in fields such as natural language processing (e.g., large language models and multilingual translation), computer vision (e.g., medical image analysis and multi-modal learning), and recommendation systems. A core problem of MoE is how to select, among all experts, the expert assigned to a specific task. This problem can be transformed into an election problem in which each expert is a candidate and the winner of the election (one or more candidates) is the expert assigned to the task according to the votes. We study a variant of committee elections from the perspective of computational complexity. Given a set of candidates, each possessing a set of attributes and a profit value, and a set of constraints specified as propositional logical expressions over the attributes, the task is to select a committee of k candidates that satisfies all constraints and whose total profit meets a given threshold. Regarding classical complexity, we design two polynomial-time algorithms for two special cases and provide several NP-hardness results. Moreover, we examine the parameterized complexity and obtain FPT, W[1]-hardness, and para-NP-hardness results.
{"title":"Committee Elections with Candidate Attribute Constraints","authors":"Aizhong Zhou , Fengbo Wang , Jiong Guo , Yutao Liu","doi":"10.1016/j.displa.2026.103377","DOIUrl":"10.1016/j.displa.2026.103377","url":null,"abstract":"<div><div>The Mixture of Experts (MoE) is a neural network architecture which is widely used in fields such as natural language processing (such as large language models, multilingual translation), computer vision (such as medical image analysis, multi-modal learning), and recommendation systems. A core problem of the MoE is how to select an expert assigned to a specific task among all experts. This problem can be transformed into an election problem where each expert is a candidate and the winner of election (a candidate or some candidates) is the expert who is assigned to the task by considering the votes. We study a variant of committee elections from the perspective of computational complexity. Given a set of candidates, each possessing a set of attributes and a profit value, and a set of constraints specified as propositional logical expressions on the attributes, the task is to select a committee of <span><math><mi>k</mi></math></span> candidates that satisfies all constraints and whose total profit meets a given threshold. Regarding the classical complexity, we design two polynomial time algorithms for two special conditions and provide some NP-hardness results. Moreover, we examine the parameterized complexity and get some FPT, W[1]-hard and para-NP-hard results.</div></div>","PeriodicalId":50570,"journal":{"name":"Displays","volume":"93 ","pages":"Article 103377"},"PeriodicalIF":3.4,"publicationDate":"2026-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146191193","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2026-07-01. Epub Date: 2026-01-27. DOI: 10.1016/j.displa.2026.103366
Xinggang Hou, Bingchen Gou, Dengkai Chen, Jianjie Chu, Xiaosai Duan, Xuerui Li, Lin Ma, Jing Chen, Yao Zhou
In monitoring tasks involving sustained interaction with display systems, fatigue is a primary factor diminishing efficiency. Traditional models confuse sleepiness with mental fatigue, which compromises the reliability of assessments. We propose an explainable multimodal framework that models these two subtypes separately and integrates them into a comprehensive fatigue assessment. To validate our methodology, we invited 20 pilots to participate in a 90-minute continuous monitoring experiment, during which we collected multimodal data including their eye movements, electroencephalogram (EEG), electrocardiogram (ECG), and video. First, we derive explicit representation functions for sleepiness and mental fatigue using symbolic regression on facial and behavioral cues, enabling continuous subtype-related labeling beyond intermittent questionnaires. Second, we identify compact physiological marker subsets via a cascaded feature selection method that combines mRMR prescreening with a heuristic search, yielding key feature sets while substantially reducing dimensionality. Finally, dynamic weighted coupling analysis based on information entropy reveals the nonlinear superposition effects between sleepiness and mental fatigue. Using 30 s windows under the current cohort and evaluation setting, the resulting comprehensive classifier achieves 94.8% accuracy. Following external validation and domain-specific adaptations, the methodology developed in this study holds broad application prospects across numerous automation scenarios involving monotonous human–machine interaction tasks.
{"title":"Distinguishing sleepiness from mental fatigue in sustained monitoring tasks to enhance the reliability of fatigue detection based on multimodal fusion","authors":"Xinggang Hou , Bingchen Gou , Dengkai Chen , Jianjie Chu , Xiaosai Duan , Xuerui Li , Lin Ma , Jing Chen , Yao Zhou","doi":"10.1016/j.displa.2026.103366","DOIUrl":"10.1016/j.displa.2026.103366","url":null,"abstract":"<div><div>In monitoring tasks involving sustained interaction with display systems, fatigue is a primary factor diminishing efficiency. Traditional models confuse sleepiness with mental fatigue, which compromises the reliability of assessments. We propose an explainable multimodal framework that models these two subtypes separately and integrates them into a comprehensive fatigue assessment. To validate our methodology, we invited 20 pilots to participate in a 90-minute continuous monitoring experiment, during which we collected multimodal data including their eye movements, electroencephalogram (EEG), electrocardiogram (ECG), and video. First, we derive explicit representation functions for sleepiness and mental fatigue using symbolic regression on facial and behavioral cues, enabling continuous subtype related labeling beyond intermittent questionnaires. Second, we identify compact physiological marker subsets via a cascaded feature selection method that combines mRMR prescreening with a heuristic search, yielding key feature sets while substantially reducing dimensionality. Finally, dynamic weighted coupling analysis based on information entropy revealed the nonlinear superposition effects between sleepiness and mental fatigue. Using 30 s windows under the current cohort and evaluation setting, the resulting comprehensive classifier achieves 94.8% accuracy. Following external validation and domain-specific adaptations, the methodology developed in this study holds broad application prospects across numerous automation scenarios involving monotonous human–machine interaction tasks.</div></div>","PeriodicalId":50570,"journal":{"name":"Displays","volume":"93 ","pages":"Article 103366"},"PeriodicalIF":3.4,"publicationDate":"2026-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146191192","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}