Using Spatial-Temporal Attention for Video Quality Evaluation

IF 3.7; Zone 2, Computer Science; JCR Q1 (COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE); International Journal of Intelligent Systems; Pub Date: 2024-07-12; DOI: 10.1155/2024/5514627

Biwei Chi, Ruifang Su, Xinhui Chen

Citations: 0

Abstract

With the rapid development of media, video quality assessment (VQA) is becoming increasingly important. VQA has applications in many domains. In remote medical diagnosis, for example, it can help improve the quality of video communication between doctors and patients, and in sports broadcasting it can help improve video clarity. The human visual system (HVS) is a crucial factor that VQA should take into account. Because attention is guided by goal-driven, top-down factors, such as anticipated locations or particularly attractive frames within a video, we propose a blind VQA algorithm based on a spatial-temporal attention model. Specifically, we first use two pretrained convolutional networks to extract low-level static-dynamic fusion features. A spatial attention-guided model then produces more representative features for frame-level quality perception, and a temporal attention-guided model yields the video-level features. Finally, these features are fed into a regression model to compute the final video quality score. Experiments on seven VQA databases achieve state-of-the-art performance, demonstrating the effectiveness of the proposed method.
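The four stages named in the abstract (feature extraction, spatial attention, temporal attention, regression) can be illustrated with a toy sketch. This is not the authors' model: the abstract does not specify the attention or regression design, so the softmax-weighted pooling, the linear regressor, and every name below (`attention_pool`, `predict_quality`, `spatial_query`, `temporal_query`, `reg_weights`, `reg_bias`) are assumptions made purely for illustration. The fused CNN features are taken as given inputs rather than computed.

```python
import math
from typing import List, Sequence

Vector = List[float]


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax: non-negative weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def attention_pool(features: Sequence[Vector], query: Vector) -> Vector:
    """Weighted average of feature vectors, weights = softmax(feature . query).

    Hypothetical stand-in for an attention module; the paper's actual
    design is not described in the abstract.
    """
    weights = softmax([dot(f, query) for f in features])
    dim = len(features[0])
    return [sum(w * f[i] for w, f in zip(weights, features)) for i in range(dim)]


def predict_quality(
    video: Sequence[Sequence[Vector]],
    spatial_query: Vector,
    temporal_query: Vector,
    reg_weights: Vector,
    reg_bias: float,
) -> float:
    """video[t][r] is the (assumed precomputed) fused feature of region r in frame t."""
    # Spatial attention: pool the regions of each frame into one frame-level feature.
    frame_features = [attention_pool(regions, spatial_query) for regions in video]
    # Temporal attention: pool the frame-level features into one video-level feature.
    video_feature = attention_pool(frame_features, temporal_query)
    # Regression: a plain linear map stands in for the regression model.
    return dot(video_feature, reg_weights) + reg_bias


if __name__ == "__main__":
    # Three frames, two regions each, two-dimensional made-up features.
    toy_video = [
        [[0.2, 0.8], [0.6, 0.4]],
        [[0.9, 0.1], [0.5, 0.5]],
        [[0.3, 0.7], [0.4, 0.6]],
    ]
    score = predict_quality(toy_video, [1.0, 0.0], [0.0, 1.0], [2.0, 3.0], 0.5)
    print(f"toy quality score: {score:.3f}")
```

In this sketch a zero query vector reduces attention pooling to a plain average, which makes the role of the attention weights easy to see: a non-zero query shifts weight toward the regions or frames that align with it.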

Source journal

International Journal of Intelligent Systems (Engineering & Technology - Computer Science: Artificial Intelligence)

CiteScore: 11.30

Self-citation rate: 14.30%

Articles published: 304

Review time: 9 months

Journal description: The International Journal of Intelligent Systems serves as a forum for individuals interested in tapping into the vast theories based on intelligent systems construction. With its peer-reviewed format, the journal explores several fascinating editorials written by today's experts in the field. Because new developments are being introduced each day, there's much to be learned: examination, analysis creation, information retrieval, man–computer interactions, and more. The International Journal of Intelligent Systems uses charts and illustrations to demonstrate these ground-breaking issues, and encourages readers to share their thoughts and experiences.