ADS-VQA: Adaptive sampling model for video quality assessment

Displays | IF 3.7 | Zone 2, Engineering & Technology | JCR Q1, Computer Science, Hardware & Architecture | Pub Date: 2024-07-04 | DOI: 10.1016/j.displa.2024.102792

Shuaibo Cheng, Xiaopeng Li, Zhaoyuan Zeng, Jia Yan

Displays, volume 84, Article 102792. Journal Article. https://www.sciencedirect.com/science/article/pii/S0141938224001562

Citations: 0

Abstract

No-reference video quality assessment (NR-VQA) for user-generated content (UGC) plays a crucial role in ensuring the quality of video services. Although some works have achieved impressive results, their performance-complexity trade-off remains sub-optimal. On the one hand, overly complex network structures and additional inputs require more computing resources. On the other hand, simple sampling methods tend to overlook the temporal characteristics of videos, degrading local textures and potentially distorting the thematic content, which in turn lowers VQA performance. Therefore, in this paper, we propose an enhanced NR-VQA model, the Adaptive Sampling Strategy for Video Quality Assessment (ADS-VQA). Temporally, we sample videos non-uniformly, using features from the lateral geniculate nucleus (LGN) to capture their temporal characteristics. Spatially, a dual-branch structure is designed to supplement spatial features across different levels. One branch samples patches at their raw resolution, effectively preserving local texture detail. The other branch performs downsampling guided by saliency cues, obtaining global semantic features at a reduced computational cost. Experimental results on four popular VQA databases demonstrate that the proposed approach achieves high performance at a lower computational cost than most state-of-the-art VQA models.
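The abstract does not give implementation details, so the following is only a minimal sketch of the two sampling ideas it names, not the authors' method. It assumes that mean absolute frame difference can stand in for the LGN-based temporal features, and that plain strided downsampling can stand in for the saliency-guided downsampling. All function names (`temporal_activity`, `nonuniform_sample`, `crop_patch`, `downsample`) are hypothetical.

```python
from bisect import bisect_left
from typing import List, Sequence

Frame = List[List[float]]  # grayscale frame stored as rows of pixel intensities


def temporal_activity(frames: Sequence[Frame]) -> List[float]:
    """Mean absolute difference to the previous frame.

    Assumption: a crude proxy for temporal features, NOT the LGN model.
    The first frame has no predecessor and gets a score of 0.
    """
    scores = [0.0]
    for prev, cur in zip(frames, frames[1:]):
        total = sum(abs(a - b)
                    for row_p, row_c in zip(prev, cur)
                    for a, b in zip(row_p, row_c))
        scores.append(total / (len(cur) * len(cur[0])))
    return scores


def nonuniform_sample(scores: Sequence[float], k: int,
                      eps: float = 1e-6) -> List[int]:
    """Pick k frame indices at evenly spaced quantiles of the activity CDF.

    Spans with more temporal activity cover more of the CDF and are
    therefore sampled more densely. Indices may repeat when k is large.
    """
    weights = [s + eps for s in scores]  # eps keeps static clips samplable
    total = sum(weights)
    cdf, acc = [], 0.0
    for w in weights:
        acc += w / total
        cdf.append(acc)
    picks = []
    for i in range(k):
        q = (i + 0.5) / k  # midpoint of the i-th quantile bin
        picks.append(min(bisect_left(cdf, q), len(cdf) - 1))
    return picks


def crop_patch(frame: Frame, top: int, left: int, size: int) -> Frame:
    """Local branch: cut a size x size patch at raw resolution."""
    return [row[left:left + size] for row in frame[top:top + size]]


def downsample(frame: Frame, factor: int) -> Frame:
    """Global branch: strided downsampling.

    Assumption: a stand-in for the saliency-guided downsampling.
    """
    return [row[::factor] for row in frame[::factor]]
```

Calling `nonuniform_sample(temporal_activity(frames), k)` on a clip whose second half contains all the motion concentrates the selected indices in that half, whereas uniform sampling would spend half its budget on the static part. The two spatial helpers only illustrate the split between a raw-resolution local view and a cheaper low-resolution global view.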

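Source journal

Displays (Engineering & Technology - Engineering: Electrical & Electronic)

CiteScore: 4.60

Self-citation rate: 25.60%

Articles published: 138

Review time: 92 days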

About the journal: Displays is the international journal covering the research and development of display technology, its effective presentation and perception of information, and applications and systems including display-human interface. Technical papers on practical developments in Displays technology provide an effective channel to promote greater understanding and cross-fertilization across the diverse disciplines of the Displays community. Original research papers solving ergonomics issues at the display-human interface advance effective presentation of information. Tutorial papers covering fundamentals, intended for display technologists and human factors engineers new to the field, will also occasionally be featured.