{"title":"基于稳定视口的无监督压缩 360$^{\\circ}$ 视频质量增强技术","authors":"Zizhuang Zou;Mao Ye;Xue Li;Luping Ji;Ce Zhu","doi":"10.1109/TBC.2024.3380435","DOIUrl":null,"url":null,"abstract":"With the popularity of panoramic cameras and head mount displays, many 360° videos have been recorded. Due to the geometric distortion and boundary discontinuity of 2D projection of 360° video, traditional 2D lossy video compression technology always generates more artifacts. Therefore, it is necessary to enhance the quality of compressed 360° video. However, 360° video characteristics make traditional 2D enhancement models cannot work properly. So the previous work tries to obtain the viewport sequence with smaller geometric distortions for enhancement. But such sequence is difficult to be obtained and the trained enhancement model cannot be well adapted to a new dataset. To address these issues, we propose a Stable viewport-based Unsupervised compressed 360° video Quality Enhancement (SUQE) method. Our method consists of two stages. In the first stage, a new data preparation module is proposed which adopts saliency-based data augmentation and viewport cropping techniques to generate training dataset. A standard 2D enhancement model is trained based on this dataset. For transferring the trained enhancement model to the target dataset, a shift prediction module is designed, which will crop a shifted viewport clip as supervision signal for model adaptation. For the second stage, by comparing the differences between the current enhanced original and shifted frames, the Mean Teacher framework is employed to further fine-tune the enhancement model. Experiment results confirm that our method achieves satisfactory performance on the public dataset. The relevant models and code will be released.","PeriodicalId":13159,"journal":{"name":"IEEE Transactions on Broadcasting","volume":"70 2","pages":"607-619"},"PeriodicalIF":3.2000,"publicationDate":"2024-04-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Stable Viewport-Based Unsupervised Compressed 360° Video Quality Enhancement\",\"authors\":\"Zizhuang Zou;Mao Ye;Xue Li;Luping Ji;Ce Zhu\",\"doi\":\"10.1109/TBC.2024.3380435\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"With the popularity of panoramic cameras and head mount displays, many 360° videos have been recorded. Due to the geometric distortion and boundary discontinuity of 2D projection of 360° video, traditional 2D lossy video compression technology always generates more artifacts. Therefore, it is necessary to enhance the quality of compressed 360° video. However, 360° video characteristics make traditional 2D enhancement models cannot work properly. So the previous work tries to obtain the viewport sequence with smaller geometric distortions for enhancement. But such sequence is difficult to be obtained and the trained enhancement model cannot be well adapted to a new dataset. To address these issues, we propose a Stable viewport-based Unsupervised compressed 360° video Quality Enhancement (SUQE) method. Our method consists of two stages. In the first stage, a new data preparation module is proposed which adopts saliency-based data augmentation and viewport cropping techniques to generate training dataset. A standard 2D enhancement model is trained based on this dataset. 
For transferring the trained enhancement model to the target dataset, a shift prediction module is designed, which will crop a shifted viewport clip as supervision signal for model adaptation. For the second stage, by comparing the differences between the current enhanced original and shifted frames, the Mean Teacher framework is employed to further fine-tune the enhancement model. Experiment results confirm that our method achieves satisfactory performance on the public dataset. The relevant models and code will be released.\",\"PeriodicalId\":13159,\"journal\":{\"name\":\"IEEE Transactions on Broadcasting\",\"volume\":\"70 2\",\"pages\":\"607-619\"},\"PeriodicalIF\":3.2000,\"publicationDate\":\"2024-04-10\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE Transactions on Broadcasting\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/10496132/\",\"RegionNum\":1,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"ENGINEERING, ELECTRICAL & ELECTRONIC\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Broadcasting","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/10496132/","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"ENGINEERING, ELECTRICAL & ELECTRONIC","Score":null,"Total":0}
Stable Viewport-Based Unsupervised Compressed 360° Video Quality Enhancement
With the popularity of panoramic cameras and head-mounted displays, many 360° videos have been recorded. Because the 2D projection of 360° video suffers from geometric distortion and boundary discontinuity, traditional 2D lossy video compression tends to produce more artifacts, so it is necessary to enhance the quality of compressed 360° video. However, the characteristics of 360° video prevent traditional 2D enhancement models from working properly. Previous work therefore tries to obtain viewport sequences with smaller geometric distortion for enhancement, but such sequences are difficult to obtain and the trained enhancement model does not adapt well to a new dataset. To address these issues, we propose a Stable viewport-based Unsupervised compressed 360° video Quality Enhancement (SUQE) method. Our method consists of two stages. In the first stage, a new data preparation module is proposed that adopts saliency-based data augmentation and viewport cropping to generate a training dataset, on which a standard 2D enhancement model is trained. To transfer the trained enhancement model to the target dataset, a shift prediction module is designed that crops a shifted viewport clip as a supervision signal for model adaptation. In the second stage, by comparing the differences between the currently enhanced original and shifted frames, the Mean Teacher framework is employed to further fine-tune the enhancement model. Experimental results confirm that our method achieves satisfactory performance on the public dataset. The relevant models and code will be released.
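To make the adaptation stage concrete, the following is a minimal sketch of the stage-two Mean Teacher fine-tuning loop, assuming a PyTorch enhancement network and a data source yielding pairs of cropped original and shifted viewport clips. The helper interfaces, loss choice, and hyperparameters are hypothetical illustrations, not the authors' released SUQE code.

```python
# Minimal sketch of the stage-two Mean Teacher adaptation (hypothetical interfaces).
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F

@torch.no_grad()
def ema_update(teacher: nn.Module, student: nn.Module, alpha: float = 0.999) -> None:
    """Mean Teacher update: teacher weights track an exponential moving average
    of the student weights."""
    for t_p, s_p in zip(teacher.parameters(), student.parameters()):
        t_p.mul_(alpha).add_(s_p, alpha=1.0 - alpha)

def adapt(student: nn.Module, viewport_pairs, lr: float = 1e-4, alpha: float = 0.999) -> nn.Module:
    """Fine-tune a pre-trained 2D enhancement model on target-domain viewport pairs.

    `viewport_pairs` is assumed to yield (orig_view, shifted_view) tensors of shape
    (N, C, H, W): a cropped viewport clip and the shifted viewport clip produced by
    the shift prediction module (cropping and re-alignment details are omitted here).
    """
    teacher = copy.deepcopy(student)             # teacher starts as a frozen copy of the student
    for p in teacher.parameters():
        p.requires_grad_(False)
    optimizer = torch.optim.Adam(student.parameters(), lr=lr)

    for orig_view, shifted_view in viewport_pairs:
        student_out = student(orig_view)
        with torch.no_grad():
            teacher_out = teacher(shifted_view)  # pseudo supervision from the shifted viewport
        # Consistency loss between the enhanced original and shifted frames.
        loss = F.l1_loss(student_out, teacher_out)

        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        ema_update(teacher, student, alpha)      # teacher slowly tracks the student
    return student
```

Keeping the teacher as an exponential moving average of the student provides a slowly varying pseudo-target, which is what allows fine-tuning on the target domain without any uncompressed ground-truth frames.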
About the journal:
The Society’s Field of Interest is “Devices, equipment, techniques and systems related to broadcast technology, including the production, distribution, transmission, and propagation aspects.” In addition to this formal FOI statement, which is used to provide guidance to the Publications Committee in the selection of content, the AdCom has further resolved that “broadcast systems includes all aspects of transmission, propagation, and reception.”