ADS-VQA: Adaptive sampling model for video quality assessment

IF 3.7; Zone 2, Engineering & Technology; Q1 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE; Displays; Pub Date: 2024-07-04; DOI: 10.1016/j.displa.2024.102792
Shuaibo Cheng, Xiaopeng Li, Zhaoyuan Zeng, Jia Yan
Displays, volume 84, Article 102792. https://www.sciencedirect.com/science/article/pii/S0141938224001562
Citations: 0

Abstract

No-reference video quality assessment (NR-VQA) for user-generated content (UGC) plays a crucial role in ensuring the quality of video services. Although some works have achieved impressive results, their performance-complexity trade-off is still sub-optimal. On the one hand, overly complex network structures and additional inputs require more computing resources. On the other hand, simple sampling methods tend to overlook the temporal characteristics of videos, resulting in degraded local textures and potential distortion of the thematic content, and consequently in a decline in VQA performance. Therefore, in this paper, we propose an enhanced NR-VQA model, the Adaptive Sampling Strategy for Video Quality Assessment (ADS-VQA). Temporally, we conduct non-uniform sampling on videos, using features from the lateral geniculate nucleus (LGN) to capture their temporal characteristics. Spatially, a dual-branch structure is designed to supplement spatial features across different levels. One branch samples patches at their raw resolution, effectively preserving local texture detail. The other branch performs downsampling guided by saliency cues, obtaining global semantic features at a reduced computational cost. Experimental results demonstrate that the proposed approach achieves high performance at a lower computational cost than most state-of-the-art VQA models on four popular VQA databases.
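The abstract gives no implementation detail, so the following is only a minimal sketch of the two sampling ideas it names, under stated assumptions. Frame-to-frame mean absolute difference is a hypothetical stand-in for the LGN-derived temporal features, and a fixed centered crop per grid cell is a hypothetical stand-in for the raw-resolution patch branch. All function names are illustrative and are not taken from the paper. The saliency-guided downsampling branch is not sketched.

```python
from bisect import bisect_left
from itertools import accumulate
from typing import List, Sequence

Frame = List[List[float]]  # one grayscale frame as rows of pixel values


def temporal_activity(frames: Sequence[Frame]) -> List[float]:
    """Mean absolute difference between each frame and its predecessor.

    Hypothetical stand-in for the LGN-derived features named in the
    abstract, which does not say how they are computed. Needs >= 2 frames.
    """
    def mad(a: Frame, b: Frame) -> float:
        total = sum(abs(p - q) for ra, rb in zip(a, b) for p, q in zip(ra, rb))
        return total / (len(a) * len(a[0]))

    diffs = [mad(frames[t - 1], frames[t]) for t in range(1, len(frames))]
    # The first frame has no predecessor, so it reuses the first difference.
    return diffs[:1] + diffs


def nonuniform_frame_indices(activity: Sequence[float], k: int) -> List[int]:
    """Pick k frame indices, placed more densely where activity is high."""
    weights = [a + 1e-8 for a in activity]  # keeps static video samplable
    cdf = list(accumulate(weights))
    total = cdf[-1]
    # Evenly spaced quantiles of the cumulative activity land on unevenly
    # spaced time steps; indices can repeat when activity is very peaked.
    targets = [(i + 0.5) / k * total for i in range(k)]
    return [min(bisect_left(cdf, t), len(cdf) - 1) for t in targets]


def raw_resolution_patches(frame: Frame, grid: int, patch: int) -> Frame:
    """Tile one centered patch x patch crop per grid cell, without resizing.

    Assumes each grid cell is at least `patch` pixels on a side.
    """
    h, w = len(frame), len(frame[0])
    out: Frame = []
    for i in range(grid):
        y = i * h // grid + (h // grid - patch) // 2  # crop top edge in cell row i
        for dy in range(patch):
            row: List[float] = []
            for j in range(grid):
                x = j * w // grid + (w // grid - patch) // 2  # crop left edge
                row.extend(frame[y + dy][x:x + patch])
            out.append(row)
    return out
```

With this sketch, a clip whose first half is static and whose second half changes from frame to frame has all of its sampled indices drawn from the second half, while a fully static clip falls back to roughly uniform spacing. Because the patches are cropped rather than resized, the pixel values in the tiled output are exactly those of the source frame.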

Source journal
Displays (Engineering & Technology; Engineering: Electrical & Electronic)
CiteScore: 4.60
Self-citation rate: 25.60%
Articles published: 138
Review time: 92 days
About the journal: Displays is the international journal covering the research and development of display technology, its effective presentation and perception of information, and applications and systems including display-human interface. Technical papers on practical developments in Displays technology provide an effective channel to promote greater understanding and cross-fertilization across the diverse disciplines of the Displays community. Original research papers solving ergonomics issues at the display-human interface advance effective presentation of information. Tutorial papers covering fundamentals intended for display technologies and human factor engineers new to the field will also occasionally be featured.