A Blind Video Quality Assessment Method via Spatiotemporal Pyramid Attention
Authors: Wenhao Shen; Mingliang Zhou; Xuekai Wei; Heqiang Wang; Bin Fang; Cheng Ji; Xu Zhuang; Jason Wang; Jun Luo; Huayan Pu; Xiaoxu Huang; Shilong Wang; Huajun Cao; Yong Feng; Tao Xiang; Zhaowei Shang
Journal: IEEE Transactions on Broadcasting, vol. 70, no. 1, pp. 251-264
Published: 2023-12-28
DOI: 10.1109/TBC.2023.3340031
URL: https://ieeexplore.ieee.org/document/10375568/
As social media communication develops, reliable multimedia quality evaluation indicators have become a prerequisite for enriching user experience services. In this paper, we propose a multiscale spatiotemporal pyramid attention (SPA) block and use it to build a blind video quality assessment (VQA) method that evaluates the perceptual quality of videos. First, we extract motion information from the video frames at different temporal scales to form a feature pyramid, which provides a feature representation with multiple visual perceptions. Second, we propose an SPA module that extracts multiscale spatiotemporal information at various temporal scales and models cross-scale dependencies. Finally, the features extracted by a network of stacked spatiotemporal pyramid blocks are passed through a regression network to predict the perceived quality. The experimental results demonstrate that our method is on par with state-of-the-art approaches. The source code is available online at https://github.com/Land5cape/SPBVQA.
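The three stages named in the abstract (a temporal feature pyramid, attention across scales, and a regression head) can be illustrated with a toy sketch. This is not the authors' SPBVQA implementation, which is in the repository linked above. Everything here is an assumption made for illustration: the function names, the strides (1, 2, 4), the use of mean absolute frame differences as "motion information", plain scaled dot-product attention without learned projections, and a hand-set linear regressor.

```python
import math


def motion_pyramid(frames, strides=(1, 2, 4)):
    """Build one motion descriptor per temporal stride (hypothetical helper).

    frames: list of equally sized frames, each a flat list of pixel values.
    Each descriptor is [mean, max] of the mean absolute difference between
    frames that are `stride` steps apart.
    """
    pyramid = []
    for s in strides:
        diffs = []
        for t in range(len(frames) - s):
            a, b = frames[t], frames[t + s]
            diffs.append(sum(abs(x - y) for x, y in zip(a, b)) / len(a))
        if not diffs:
            raise ValueError(f"need more than {s} frames for stride {s}")
        pyramid.append([sum(diffs) / len(diffs), max(diffs)])
    return pyramid


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def cross_scale_attention(descriptors):
    """Scaled dot-product self-attention across scales.

    Assumption: queries, keys and values are the raw descriptors; a real
    model would use learned projections.
    """
    d = len(descriptors[0])
    attended = []
    for q in descriptors:
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
            for k in descriptors
        ]
        weights = softmax(scores)
        attended.append(
            [sum(w * v[j] for w, v in zip(weights, descriptors)) for j in range(d)]
        )
    return attended


def predict_quality(frames, weights, bias=0.0):
    """Linear regression head over the flattened attended descriptors."""
    attended = cross_scale_attention(motion_pyramid(frames))
    flat = [x for vec in attended for x in vec]
    return sum(w * x for w, x in zip(weights, flat)) + bias


# Usage: a static 6-frame clip has no motion, so the score equals the bias.
static_clip = [[0.0] * 4 for _ in range(6)]
score = predict_quality(static_clip, weights=[1.0] * 6, bias=2.5)
```

In a trained model the regression weights, and the projections inside the attention step, would be learned from videos with subjective quality labels; the fixed values above only make the data flow visible.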
About the journal:
The Society’s Field of Interest is “Devices, equipment, techniques and systems related to broadcast technology, including the production, distribution, transmission, and propagation aspects.” In addition to this formal FOI statement, which is used to provide guidance to the Publications Committee in the selection of content, the AdCom has further resolved that “broadcast systems includes all aspects of transmission, propagation, and reception.”