Pub Date: 2018-05-01. DOI: 10.1109/QoMEX.2018.8463397
Chenyan Zhang, A. S. Hoel, A. Perkis, Saman Zadtootaghaj
In this paper, the immersiveness of three variations of spatial content was tested and compared. Content A is a high-quality architectural visualization, characterized as purely spatial immersion. Content B is a compilation, of mixed quality, of the best goal moments from real football games, mainly spatial immersion with a slight tactical-immersion focus. Content C is a high-quality recorded virtual game animation, mainly spatial immersion with a slight emotional-immersion focus. Each of the three spatial contents was cut into lengths of 3 min, 7 min and 11 min. Participants rated their immersive experience on a 34-item questionnaire after watching, on a 10-inch tablet, a combination of three media clips fully randomized in content type and duration. Results show that, overall, the 7 min duration produced a significantly greater immersive experience than the 3 min and 11 min durations for contents A and C, while for content B the 3 min duration stood out as the most immersive. Our study suggests that longer is not necessarily more immersive: there is an optimal duration for spatial immersion (around 7 min), after which, without enough dramaturgical structure to sustain audience interest, the immersiveness of spatial content diminishes significantly (i.e. immersion turns into boredom). Our study also shows that realism factors play a crucial role in inducing spatial immersion.
Title: "How Long is Long Enough to Induce Immersion?" In: 2018 Tenth International Conference on Quality of Multimedia Experience (QoMEX), pp. 1-6.
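The duration comparison described above can be illustrated by averaging questionnaire ratings per clip-length condition. The ratings and the 1-10 scale below are made-up illustration data, not the study's results:

```python
from statistics import mean

# Hypothetical per-participant mean scores over a 34-item immersion
# questionnaire (1-10 scale), grouped by clip duration. Illustrative
# numbers only, not taken from the paper.
ratings = {
    "3 min":  [5.1, 4.8, 5.6, 5.0],
    "7 min":  [7.2, 6.9, 7.5, 7.1],
    "11 min": [5.4, 5.9, 5.2, 5.6],
}

def mean_immersion(ratings):
    """Average immersion score per duration condition."""
    return {dur: mean(scores) for dur, scores in ratings.items()}

scores = mean_immersion(ratings)
best = max(scores, key=scores.get)  # the 7 min condition scores highest here
```

With real data, the per-condition means would feed a significance test across durations rather than a simple maximum.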
Pub Date: 2018-05-01. DOI: 10.1109/QoMEX.2018.8463423
Sajad Mowlaei, Steven Schmidt, Saman Zadtootaghaj, S. Möller
Recent advancements in network architecture, such as 5G, promise a bright future for cloud services with strict network constraints. Cloud gaming, as an interactive service, has strict end-to-end delay constraints. Therefore, many studies have investigated the impact of network parameters such as delay or packet loss on gaming QoE. However, they mostly compared games or genres with each other and neglected the fact that even two levels of the same game may have different sensitivity to delay. To understand the game characteristics that cause this difference in delay sensitivity, a bottom-up approach by means of modifiable open-source games can be of high value. In this paper we present a game designed to tackle this issue. The game allows researchers to artificially change its characteristics, such as the pace and size of objects, and to simulate influences such as delay, packet loss or a reduced frame rate. This also makes the game usable for crowdsourcing studies, where the participants' network conditions cannot be controlled, and for investigating the impact of spatial and temporal accuracy on sensitivity to impairments.
Title: "Know your Game: A Bottom-Up Approach for Gaming Research" In: 2018 Tenth International Conference on Quality of Multimedia Experience (QoMEX), pp. 1-3.
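The delay simulation described above can be approximated by buffering input events and releasing them only after an artificial latency has elapsed. The `DelayedInput` class and its millisecond API below are a hypothetical sketch, not the paper's implementation:

```python
from collections import deque

class DelayedInput:
    """Buffer input events and release them after a fixed artificial
    delay, approximating network latency in a cloud-gaming setup.
    Illustrative sketch only; not the game presented in the paper."""

    def __init__(self, delay_ms):
        self.delay_ms = delay_ms
        self._queue = deque()  # entries: (release_time_ms, event)

    def push(self, now_ms, event):
        """Record an event; it becomes visible delay_ms later."""
        self._queue.append((now_ms + self.delay_ms, event))

    def poll(self, now_ms):
        """Return all events whose artificial delay has elapsed."""
        ready = []
        while self._queue and self._queue[0][0] <= now_ms:
            ready.append(self._queue.popleft()[1])
        return ready

inp = DelayedInput(delay_ms=100)
inp.push(0, "jump")
inp.poll(50)   # -> [] (still held back)
inp.poll(120)  # -> ["jump"]
```

Packet loss could be simulated in the same loop by randomly dropping queued events before release.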
The growing demand for high-resolution (HR) images has boosted the development of interpolation techniques. However, it remains challenging to objectively evaluate the perceptual quality of interpolated images, especially when the interpolation factor is a non-integer. To address this issue, we propose a hybrid quality metric for non-integer image interpolation that combines both reduced-reference and no-reference philosophies. To validate the proposed metric, we construct a database of non-integer interpolated images and conduct a subjective user study to collect opinions for each image. Experiments on the new database show that the proposed metric outperforms previous methods by a large margin.
Title: "A Hybrid Quality Metric for Non-Integer Image Interpolation" by Jinling Chen, Yiwen Xu, Kede Ma, Huiwen Huang and Tiesong Zhao. Pub Date: 2018-05-01. DOI: 10.1109/QoMEX.2018.8463405. In: 2018 Tenth International Conference on Quality of Multimedia Experience (QoMEX), pp. 1-3.
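One simple way to fuse reduced-reference and no-reference estimates, in the spirit of the hybrid metric described above, is a convex combination. The `hybrid_score` function and its `alpha` weight below are illustrative assumptions, not the paper's actual fusion rule:

```python
def hybrid_score(rr_score, nr_score, alpha=0.5):
    """Fuse a reduced-reference (RR) quality score and a no-reference
    (NR) quality score by convex combination. The weighting scheme is
    a hypothetical stand-in for the paper's fusion rule."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * rr_score + (1.0 - alpha) * nr_score

# Example: lean on the RR estimate when partial reference data exists.
quality = hybrid_score(rr_score=0.8, nr_score=0.6, alpha=0.75)
```

In practice `alpha` would be fitted against subjective scores such as those collected in the paper's user study.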
Pub Date: 2018-05-01. DOI: 10.1109/QoMEX.2018.8463368
Linlin Bie, Xu Wang, J. Korhonen
Consumer photos taken in low-light conditions often suffer from substantial undesired capture artifacts, such as shakiness and sensor noise. In this paper, we use a rank-ordering method to assess subjective preferences among different post-processing methods used to alleviate capture artifacts. The results show that most users prefer sharpened photos, even in the presence of substantial sensor noise. However, there are also systematic differences in individual preferences between users. Therefore, user preferences need to be considered in addition to image characteristics when selecting post-processing algorithms and parameters for photo quality enhancement.
Title: "Subjective Assessment of Post-Processing Methods for Low Light Consumer Photos" In: 2018 Tenth International Conference on Quality of Multimedia Experience (QoMEX), pp. 1-3.
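Rank-ordering data like that collected above is often aggregated with a Borda count, where an item ranked r-th among n alternatives earns n - 1 - r points. The variant names and toy rankings below are invented for illustration:

```python
from collections import defaultdict

def borda_scores(rankings):
    """Aggregate per-user rank orders (best first) into Borda scores.
    Higher totals indicate stronger overall preference."""
    totals = defaultdict(int)
    for order in rankings:
        n = len(order)
        for rank, item in enumerate(order):
            totals[item] += n - 1 - rank
    return dict(totals)

# Toy preferences over three hypothetical post-processing variants:
rankings = [
    ["sharpened", "denoised", "original"],
    ["sharpened", "original", "denoised"],
    ["denoised", "sharpened", "original"],
]
scores = borda_scores(rankings)  # "sharpened" accumulates the most points
```

Inspecting per-user rankings against these totals would also surface the systematic individual differences the paper reports.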
Pub Date: 2018-05-01. DOI: 10.1109/QoMEX.2018.8463421
R. R. Tamboli, Balasubramanyam Appina, P. A. Kara, M. Martini, Sumohana S. Channappayya, S. Jana
With the recent advent of light field visualization, the acquisition/creation, encoding, transmission, rendering and quality assessment of 3D light field content has gained momentum. In particular, large light field displays need content with a large field of view and with high spatial and angular quality. Accordingly, subjective and objective quality evaluation studies have been conducted to examine spatial, angular and spatio-angular aspects of light field visualization. Recently, the effect on Quality of Experience (QoE) of various zoom levels of the displayed content, as well as of regions of interest, has also been explored. However, there has been no systematic attempt to see how the features of the content itself affect visualization quality. In this work, we examine the effects of some primitive features of the content on subjective QoE. The results are based on a subjective study conducted on a large light field display offering virtually continuous horizontal parallax.
Title: "Effect of Primitive Features of Content on Perceived Quality of Light Field Visualization" In: 2018 Tenth International Conference on Quality of Multimedia Experience (QoMEX), pp. 1-3.
Pub Date: 2018-05-01. DOI: 10.1109/QoMEX.2018.8463396
Steve Goering, Konstantin Brand, A. Raake
Today's photo platforms receive thousands of new pictures, and it becomes challenging to find highly appealing or likeable photos within such loads of data. Automatic liking prediction can help users manage their pictures and improve ranking on sharing platforms. We describe a machine learning approach for photo liking prediction. Our features are based on various techniques, e.g. natural language processing/sentiment analysis, pre-trained deep learning networks and social network analysis, and extend previously reported features. We conduct large-scale experiments using a collected dataset of 80k photos from two main categories on 500px with different settings. In our experiments we analyzed the impact of our new features and found that social network features have the strongest influence on liking prediction, achieving a boost of 15%. Furthermore, we show that all implemented features improve the prediction accuracy of liking rates. We additionally analyze which groups of features derived directly from pictures are usable for prediction.
Title: "Extended Features using Machine Learning Techniques for Photo Liking Prediction" In: 2018 Tenth International Conference on Quality of Multimedia Experience (QoMEX), pp. 1-6.
Pub Date: 2018-05-01. DOI: 10.1109/QoMEX.2018.8463392
Rishabh Gupta, Anderson R. Avila, T. Falk
Subjective evaluation of synthesized speech is not an easy task, as various quality dimensions can be affected, including naturalness, prosody, pronunciation and continuity, to name a few. Evaluations typically rely on naive listeners, thus more closely representing the consumers of commercial products. As such, while the results of these costly and time-consuming tests may provide text-to-speech (TTS) system developers with feedback on the perceived quality and acceptability of their devices, they provide little information on what the sources of the problems are and what can be done about them. In this paper, we propose the use of neuroimaging to probe the unconscious cognitive processing of naive listeners as they listen to synthesized speech generated by different systems of varying quality. The obtained neural insights have allowed us to extract a small subset of highly relevant features from the speech signals and to use these features to build a simple, no-reference instrumental quality metric specifically tailored to TTS speech. The metric is tested on an unseen dataset and shown to significantly outperform a benchmark algorithm.
Title: "Towards a Neuro-Inspired No-Reference Instrumental Quality Measure for Text-to-Speech Systems" In: 2018 Tenth International Conference on Quality of Multimedia Experience (QoMEX), pp. 1-6.
Pub Date: 2018-05-01. DOI: 10.1109/QoMEX.2018.8463388
Irene Viola, T. Ebrahimi
In recent years, light field imaging has experienced a surge of popularity in the scientific community for its capability of rendering the 3D world in a more immersive way. In particular, several compression algorithms have been proposed to efficiently reduce the amount of data generated in the acquisition process, and different methodologies have been designed to reliably evaluate the visual quality of compressed content. In this paper we propose a dataset for visual quality assessment of light field images (VALID). The dataset contains five contents compressed at various bitrates, using both off-the-shelf solutions and state-of-the-art algorithms. Results of objective quality evaluation using popular image metrics are included, as well as annotated subjective scores obtained with three different methodologies and two types of visualization setups. The proposed dataset will help develop new objective metrics to predict visual quality, design new subjective assessment methodologies and compare them to existing ones, and produce novel analysis approaches to interpret the results.
Title: "VALID: Visual quality Assessment for Light field Images Dataset" In: 2018 Tenth International Conference on Quality of Multimedia Experience (QoMEX), pp. 1-3.
Pub Date: 2018-05-01. DOI: 10.1109/QoMEX.2018.8463369
Jesús Gutiérrez, Erwan J. David, A. Coutrot, Matthieu Perreira Da Silva, P. Callet
Virtual Reality (VR) provides users with new immersive media experiences, offering the possibility to freely explore 360° content. Understanding these new exploration behaviors is crucial for the development of efficient techniques for processing, coding, delivering and rendering omnidirectional content to offer the highest possible Quality of Experience (QoE). Progress has already been made on visual attention (VA) modeling for 360° content. In this paper we briefly review the current status of research on this topic, which led us to propose a benchmarking platform for evaluating and comparing the performance of models for saliency and scanpath prediction for 360° content. This paper introduces the "UN Salient360! Benchmark" platform, featuring a dataset, a toolbox and a framework for evaluating different classes of models. This online platform can be found at https://salient360.ls2n.fr/.
Title: "Introducing UN Salient360! Benchmark: A platform for evaluating visual attention models for 360° contents" In: 2018 Tenth International Conference on Quality of Multimedia Experience (QoMEX), pp. 1-3.
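Saliency-map predictions are commonly benchmarked with, among other measures, the Pearson correlation coefficient (CC) between predicted and ground-truth maps. A minimal pure-Python version of that measure, not taken from the Salient360! toolbox, might look like this:

```python
from math import sqrt

def saliency_cc(map_a, map_b):
    """Pearson correlation coefficient between two flattened saliency
    maps, a standard comparison measure in saliency benchmarking.
    Sketch for illustration; real maps would be 2-D arrays."""
    n = len(map_a)
    if n == 0 or n != len(map_b):
        raise ValueError("maps must be non-empty and equally sized")
    mean_a = sum(map_a) / n
    mean_b = sum(map_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(map_a, map_b))
    var_a = sum((a - mean_a) ** 2 for a in map_a)
    var_b = sum((b - mean_b) ** 2 for b in map_b)
    return cov / sqrt(var_a * var_b)

# A model that reproduces the ground truth correlates perfectly (CC near 1.0):
saliency_cc([0.1, 0.5, 0.9], [0.1, 0.5, 0.9])
```

Scanpath prediction needs different measures (e.g. sequence comparison), since fixation order matters there.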
Pub Date: 2018-05-01. DOI: 10.1109/QoMEX.2018.8463384
Oliver Wiedemann, Vlad Hosu, Hanhe Lin, D. Saupe
Image quality has been studied almost exclusively as a global image property. It is common practice for IQA databases and metrics to quantify this abstract concept with a single number per image. We propose an approach to blind IQA based on a convolutional neural network (patchnet) trained on a novel set of 32,000 individually annotated patches of 64×64 pixels. We use this model to generate spatially small local quality maps of images taken from KonIQ-10k, a large and diverse in-the-wild database of authentically distorted images. We show that our local quality indicator correlates well with global MOS, going beyond the predictive ability of quality-related attributes such as sharpness. Averaging patchnet predictions already outperforms classical approaches to global MOS prediction that were trained on global image features. We additionally experiment with a generic second-stage aggregation CNN to estimate mean opinion scores. The latter model performs comparably to the state of the art, with a PLCC of 0.81 on KonIQ-10k.
Title: "Disregarding the Big Picture: Towards Local Image Quality Assessment" In: 2018 Tenth International Conference on Quality of Multimedia Experience (QoMEX), pp. 1-6.
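The patch-score averaging that the abstract reports as a strong baseline can be sketched as arranging per-patch predictions into a local quality map and pooling them by the mean. The patch scores below are made-up numbers, and `quality_map` is a hypothetical helper, not the paper's code:

```python
def quality_map(patch_scores, cols):
    """Arrange per-patch quality predictions into a row-major 2-D local
    quality map and pool them by plain averaging into a global score.
    Illustrative sketch; the paper's patch scores come from a trained CNN."""
    if not patch_scores:
        raise ValueError("need at least one patch prediction")
    rows = [patch_scores[i:i + cols] for i in range(0, len(patch_scores), cols)]
    return rows, sum(patch_scores) / len(patch_scores)

# Four invented patch predictions arranged as a 2x2 local quality map:
local_map, global_score = quality_map([3.2, 3.8, 4.1, 3.5], cols=2)
```

The paper's second-stage aggregation CNN replaces this plain mean with a learned pooling over the same patch predictions.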