No-reference Point Clouds Quality Assessment using Transformer and Visual Saliency
Salima Bourbia, Ayoub Karine, A. Chetouani, M. El Hassouni, M. Jridi
Proceedings of the 2nd Workshop on Quality of Experience in Visual Multimedia Applications, 2022-10-14
DOI: 10.1145/3552469.3555713 (https://doi.org/10.1145/3552469.3555713)
Citations: 1
Abstract
Quality estimation of 3D objects/scenes represented by point clouds is a crucial and challenging task in computer vision. In real-world applications, reference data is not always available, which motivates the development of new point cloud quality assessment (PCQA) metrics that do not require the original 3D point cloud (3DPC). This family of methods is called no-reference or blind PCQA. In this context, we propose a deep-learning-based approach that leverages the self-attention mechanism of transformers to accurately predict the perceptual quality score of each degraded 3DPC. Additionally, we introduce saliency maps to reflect the behavior of the human visual system, which is drawn to certain regions more than others during evaluation. To this end, we first render 2D projections (i.e., views) of a 3DPC from different viewpoints. Then, we weight the projected images with their corresponding saliency maps. Next, we discard most of the background information by extracting salient sub-images. These sub-images are fed as a sequential input to a vision transformer in order to capture global contextual information and predict a quality score for each sub-image. Finally, we average the scores of all salient sub-images to obtain the perceptual quality score of the 3DPC. We evaluate the performance of our model on the ICIP2020 and SJTU point cloud quality assessment benchmarks. Experimental results show that our model achieves promising performance compared to state-of-the-art point cloud quality assessment metrics.
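To make the described pipeline concrete, below is a minimal sketch of the projection-weighting, salient sub-image extraction, and transformer-based scoring steps, assuming PyTorch. The view renderer, the saliency predictor, the patch/sub-image selection rule, and the transformer configuration shown here are illustrative placeholders (hypothetical helpers such as extract_salient_patches and PatchQualityTransformer), not the authors' actual implementation.

```python
# Hedged sketch of the pipeline described in the abstract (not the paper's code).
import torch
import torch.nn as nn


def saliency_weight(view: torch.Tensor, saliency: torch.Tensor) -> torch.Tensor:
    """Weight a projected view (C, H, W) by its saliency map (H, W)."""
    return view * saliency.unsqueeze(0)


def extract_salient_patches(weighted: torch.Tensor, saliency: torch.Tensor,
                            patch: int = 64, top_k: int = 8) -> torch.Tensor:
    """Split the weighted view into non-overlapping patches and keep the
    top_k patches with the highest mean saliency (discarding background)."""
    c, _, _ = weighted.shape
    tiles = weighted.unfold(1, patch, patch).unfold(2, patch, patch)       # C, nh, nw, p, p
    sal = saliency.unfold(0, patch, patch).unfold(1, patch, patch)         # nh, nw, p, p
    tiles = tiles.permute(1, 2, 0, 3, 4).reshape(-1, c, patch, patch)      # N, C, p, p
    scores = sal.reshape(sal.shape[0] * sal.shape[1], -1).mean(dim=1)      # N
    idx = scores.topk(min(top_k, scores.numel())).indices
    return tiles[idx]


class PatchQualityTransformer(nn.Module):
    """Tiny ViT-style regressor: patch tokens -> transformer encoder -> score."""

    def __init__(self, patch: int = 64, token: int = 16, dim: int = 128,
                 depth: int = 4, heads: int = 4):
        super().__init__()
        self.embed = nn.Conv2d(3, dim, kernel_size=token, stride=token)
        n_tokens = (patch // token) ** 2
        self.cls = nn.Parameter(torch.zeros(1, 1, dim))
        self.pos = nn.Parameter(torch.zeros(1, n_tokens + 1, dim))
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=depth)
        self.head = nn.Linear(dim, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:        # x: (B, 3, p, p)
        tokens = self.embed(x).flatten(2).transpose(1, 2)       # B, N, dim
        cls = self.cls.expand(x.shape[0], -1, -1)
        z = self.encoder(torch.cat([cls, tokens], dim=1) + self.pos)
        return self.head(z[:, 0]).squeeze(-1)                   # one score per sub-image


def predict_quality(views, saliencies, model) -> torch.Tensor:
    """Average the predicted scores of all salient sub-images over all views."""
    scores = []
    for view, sal in zip(views, saliencies):
        patches = extract_salient_patches(saliency_weight(view, sal), sal)
        scores.append(model(patches))
    return torch.cat(scores).mean()


if __name__ == "__main__":
    model = PatchQualityTransformer()
    views = [torch.rand(3, 256, 256) for _ in range(6)]         # 6 rendered viewpoints (dummy data)
    sals = [torch.rand(256, 256) for _ in range(6)]             # their saliency maps (dummy data)
    print(predict_quality(views, sals, model).item())
```

In this sketch the final quality score is simply the mean over all salient sub-image predictions, matching the averaging step described in the abstract; the number of views, patch size, and top_k are arbitrary choices for illustration.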