一种心理视觉调谐图像编解码器

2011 IEEE 13th International Workshop on Multimedia Signal Processing Pub Date : 2011-12-01 DOI:10.1109/MMSP.2011.6093772

Guangtao Zhai, Xiaolin Wu, Yi Niu

{"title":"一种心理视觉调谐图像编解码器","authors":"Guangtao Zhai, Xiaolin Wu, Yi Niu","doi":"10.1109/MMSP.2011.6093772","DOIUrl":null,"url":null,"abstract":"A psychovisual quality driven image codec exploiting the psychological and neurological process of visual perception is proposed in this paper. Recent findings in brain theory and neuroscience suggest that visual perception is a process of fitting brain's internal generative model to the outside retina stimuli. And the psychovisual quality is related to how accurately visual sensory data can be explained by the internal generative model. Therefore, the design criterion of our psychovisually tuned image compression system is to find a compact description of the optimal generative model from the input image on the encoding end, which is then used to regenerate the output image on the decoding end. By noting an important finding from empirical natural image statistics that natural images have scale invariant features in the pixels' high order statistics, the generative model can be efficiently compressed through model preserving spatial downsampling on the encoder. And the decoder can reverse the process with a model preserving upsampling module to generate the decoded image. The proposed system is fully standard complaint because the downsampled image can be compressed with any exiting codec (JPEG2000 in this work). The proposed algorithm is shown to systematically outperform JPEG2000 in a wide bit rate range in terms of both subjective and objective qualities.","PeriodicalId":214459,"journal":{"name":"2011 IEEE 13th International Workshop on Multimedia Signal Processing","volume":"33 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2011-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":"{\"title\":\"A psychovisually tuned image codec\",\"authors\":\"Guangtao Zhai, Xiaolin Wu, Yi Niu\",\"doi\":\"10.1109/MMSP.2011.6093772\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"A psychovisual quality driven image codec exploiting the psychological and neurological process of visual perception is proposed in this paper. Recent findings in brain theory and neuroscience suggest that visual perception is a process of fitting brain's internal generative model to the outside retina stimuli. And the psychovisual quality is related to how accurately visual sensory data can be explained by the internal generative model. Therefore, the design criterion of our psychovisually tuned image compression system is to find a compact description of the optimal generative model from the input image on the encoding end, which is then used to regenerate the output image on the decoding end. By noting an important finding from empirical natural image statistics that natural images have scale invariant features in the pixels' high order statistics, the generative model can be efficiently compressed through model preserving spatial downsampling on the encoder. And the decoder can reverse the process with a model preserving upsampling module to generate the decoded image. The proposed system is fully standard complaint because the downsampled image can be compressed with any exiting codec (JPEG2000 in this work). The proposed algorithm is shown to systematically outperform JPEG2000 in a wide bit rate range in terms of both subjective and objective qualities.\",\"PeriodicalId\":214459,\"journal\":{\"name\":\"2011 IEEE 13th International Workshop on Multimedia Signal Processing\",\"volume\":\"33 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2011-12-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"3\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2011 IEEE 13th International Workshop on Multimedia Signal Processing\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/MMSP.2011.6093772\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2011 IEEE 13th International Workshop on Multimedia Signal Processing","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/MMSP.2011.6093772","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 3

摘要

本文提出了一种利用视觉感知的心理和神经过程的心理视觉质量驱动的图像编解码器。脑理论和神经科学的最新研究表明，视觉感知是大脑内部生成模型适应外部视网膜刺激的过程。而心理视觉质量则与内部生成模型解释视觉感官数据的准确度有关。因此，我们的视觉心理调整图像压缩系统的设计准则是在编码端从输入图像中找到最优生成模型的紧凑描述，然后使用该模型在解码端重新生成输出图像。注意到经验自然图像统计的一个重要发现，即自然图像在像素的高阶统计量中具有尺度不变特征，通过在编码器上进行模型保持空间下采样，可以有效地压缩生成模型。解码器可以通过模型保持上采样模块反转该过程以生成解码图像。所提出的系统是完全标准的投诉，因为下采样图像可以用任何现有的编解码器压缩(在这项工作中是JPEG2000)。在较宽的比特率范围内，该算法的主观和客观质量均优于JPEG2000。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

A psychovisually tuned image codec

A psychovisual quality driven image codec exploiting the psychological and neurological process of visual perception is proposed in this paper. Recent findings in brain theory and neuroscience suggest that visual perception is a process of fitting brain's internal generative model to the outside retina stimuli. And the psychovisual quality is related to how accurately visual sensory data can be explained by the internal generative model. Therefore, the design criterion of our psychovisually tuned image compression system is to find a compact description of the optimal generative model from the input image on the encoding end, which is then used to regenerate the output image on the decoding end. By noting an important finding from empirical natural image statistics that natural images have scale invariant features in the pixels' high order statistics, the generative model can be efficiently compressed through model preserving spatial downsampling on the encoder. And the decoder can reverse the process with a model preserving upsampling module to generate the decoded image. The proposed system is fully standard complaint because the downsampled image can be compressed with any exiting codec (JPEG2000 in this work). The proposed algorithm is shown to systematically outperform JPEG2000 in a wide bit rate range in terms of both subjective and objective qualities.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

2011 IEEE 13th International Workshop on Multimedia Signal Processing

自引率

0.00%

发文量

期刊最新文献

Separation of speech sources using an Acoustic Vector Sensor Strategies for orca call retrieval to support collaborative annotation of a large archive Recognizing actions using salient features Region of interest determination using human computation Image super-resolution via feature-based affine transform