Full-Reference Video Quality Assessment Based on Spatiotemporal Visual Sensitivity
Huiyuan Fu, Da Pan, Ping Shi
2021 International Conference on Culture-oriented Science & Technology (ICCST), 2021-11-01
DOI: 10.1109/ICCST53801.2021.00071 (https://doi.org/10.1109/ICCST53801.2021.00071)
Video streaming has become an important business for network service providers, and accurately predicting perceptual video quality scores helps them deliver high-quality video services. Many video quality assessment (VQA) methods try to simulate the human visual system (HVS) to improve performance. In this paper, we propose a full-reference video quality assessment (FR-VQA) method named DeepVQA-FBSA, which is based on spatiotemporal visual sensitivity. It first uses a convolutional neural network (CNN) to obtain visual sensitivity maps of frames from the input spatiotemporal information. The visual sensitivity maps are then used to obtain the perceptual features of each frame, which we call frame-level features. The frame-level features are fed into a Feature Based Self-attention (FBSA) module, which fuses them into video-level features that are used to predict the video quality score. Experimental results show that the predictions of our method are highly consistent with subjective evaluation results.
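The pipeline described in the abstract (sensitivity map, frame-level features, self-attention fusion, score prediction) can be illustrated with a small sketch. This is not the DeepVQA-FBSA implementation: the abstract does not give the network architecture, the feature definition, or the attention layout. The sketch assumes that a sensitivity map weights a per-pixel squared error between reference and distorted frames, that attention uses identity projections instead of learned ones, and that a linear layer maps the video-level feature to a score. The names `frame_feature`, `self_attention_pool`, and `predict_score`, and all numeric values, are hypothetical.

```python
import math


def frame_feature(reference, distorted, sensitivity):
    """Pool a sensitivity-weighted squared error map into a 2-D feature [mean, max].

    All arguments are equally sized 2-D lists of floats. In the paper the
    sensitivity map comes from a CNN; here it is simply passed in (assumption).
    """
    weighted = [
        s * (r - d) ** 2
        for ref_row, dis_row, sen_row in zip(reference, distorted, sensitivity)
        for r, d, s in zip(ref_row, dis_row, sen_row)
    ]
    return [sum(weighted) / len(weighted), max(weighted)]


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention_pool(features):
    """Scaled dot-product self-attention over frames, then a mean over frames.

    Queries, keys, and values are the frame features themselves (identity
    projections), a simplification of any learned attention module.
    """
    dim = len(features[0])
    attended = []
    for query in features:
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in features
        ]
        weights = softmax(scores)
        attended.append(
            [sum(w * f[i] for w, f in zip(weights, features)) for i in range(dim)]
        )
    return [sum(a[i] for a in attended) / len(attended) for i in range(dim)]


def predict_score(video_feature, weights, bias):
    """Hypothetical linear regression head mapping the video feature to a score."""
    return sum(w * x for w, x in zip(weights, video_feature)) + bias


if __name__ == "__main__":
    reference = [[0.5, 0.5], [0.5, 0.5]]
    distorted_frames = [
        [[0.5, 0.5], [0.5, 0.5]],  # undistorted frame
        [[0.4, 0.5], [0.5, 0.7]],  # distorted frame
    ]
    sensitivity = [[1.0, 0.5], [0.5, 1.0]]  # made-up sensitivity map
    features = [frame_feature(reference, f, sensitivity) for f in distorted_frames]
    video_feature = self_attention_pool(features)
    print(predict_score(video_feature, weights=[-1.0, -1.0], bias=1.0))
```

In a trained model the sensitivity maps, attention projections, and regression weights would all be learned from subjective scores; the fixed values here only show how data flows from frames to a single video-level prediction.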