
2011 IEEE 13th International Workshop on Multimedia Signal Processing: Latest Publications

Color filter array demosaicking using optimized edge direction map
Pub Date : 2011-12-01 DOI: 10.1109/MMSP.2011.6093801
Seyun Kim, N. Cho
This paper proposes a new color filter array demosaicking method with emphasis on edge estimation. In many existing approaches, demosaicking is treated as a directional interpolation problem, so finding the correct edge direction is a very important factor. However, these methods sometimes fail to determine an accurate interpolation direction because they rely on local information from neighboring pixels. To estimate edge directions using global information, we employ an MRF framework in which the energy function is formulated by defining new notions of interpolation risk and pixel connectivity. Minimizing this function gives the edge directions, and the green channel is interpolated along the edges. We then iterate a luminance update and color correction using the high frequencies from the green channel. The algorithm is tested on commonly used images and is shown to yield higher CPSNR than state-of-the-art methods on many images, by up to 2.7 dB and by 0.4 dB on average. Subjective comparison also shows that the proposed method produces fewer artifacts on complex structures.
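To make the directional-interpolation view concrete, here is a minimal sketch (not the authors' method) of green-channel interpolation on an RGGB Bayer pattern, where the direction is chosen from purely local gradients; the paper's contribution is to replace this local test with an edge direction map optimized globally in an MRF framework built from interpolation risk and pixel connectivity.

```python
import numpy as np

def interpolate_green(cfa):
    """Directional green interpolation on an RGGB Bayer pattern (toy version).

    The interpolation direction at each red/blue site is picked from local
    horizontal/vertical gradients; the paper replaces this purely local test
    with a globally optimized edge direction map.
    """
    cfa = np.asarray(cfa, dtype=float)
    h, w = cfa.shape
    green = cfa.copy()
    # Green samples in an RGGB pattern sit at (even row, odd col) and (odd row, even col).
    green_mask = np.zeros((h, w), dtype=bool)
    green_mask[0::2, 1::2] = True
    green_mask[1::2, 0::2] = True
    for y in range(2, h - 2):
        for x in range(2, w - 2):
            if green_mask[y, x]:
                continue  # green was measured here
            # Gradient along each candidate direction (neighbouring greens + same-colour curvature).
            dh = abs(cfa[y, x - 1] - cfa[y, x + 1]) + abs(2 * cfa[y, x] - cfa[y, x - 2] - cfa[y, x + 2])
            dv = abs(cfa[y - 1, x] - cfa[y + 1, x]) + abs(2 * cfa[y, x] - cfa[y - 2, x] - cfa[y + 2, x])
            if dh <= dv:   # edge runs horizontally: interpolate along the row
                green[y, x] = (cfa[y, x - 1] + cfa[y, x + 1]) / 2
            else:          # edge runs vertically: interpolate along the column
                green[y, x] = (cfa[y - 1, x] + cfa[y + 1, x]) / 2
    return green
```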
Citations: 3
Adaptive noise model for transform domain Wyner-Ziv video using clustering of DCT blocks
Pub Date : 2011-12-01 DOI: 10.1109/MMSP.2011.6093774
Huynh Van Luong, Xin Huang, Søren Forchhammer
The noise model is one of the most important aspects influencing the coding performance of Distributed Video Coding. This paper proposes a novel noise model for Transform Domain Wyner-Ziv (TDWZ) video coding by using clustering of DCT blocks. The clustering algorithm takes advantage of the residual information of all frequency bands, iteratively classifies blocks into different categories and estimates the noise parameter in each category. The experimental results show that the coding performance of the proposed cluster-level noise model is competitive with state-of-the-art coefficient-level noise modelling. Furthermore, the proposed cluster-level noise model is adaptively combined with a coefficient-level noise model in this paper to robustly improve the coding performance of the TDWZ video codec by up to 1.24 dB (Bjøntegaard metric) compared to the DISCOVER TDWZ video codec.
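A minimal sketch of the cluster-level idea, assuming a Laplacian correlation model (the usual choice in TDWZ codecs) and, for simplicity, a pixel-domain residual: blocks are grouped by activity with a one-dimensional k-means and a noise parameter is estimated per cluster, whereas the paper iterates the classification using the residual information of all DCT frequency bands.

```python
import numpy as np

def cluster_noise_parameters(residual, block=4, K=3, iters=10):
    """Toy cluster-level noise model for a residual frame.

    Blocks are grouped by their mean absolute residual with a 1-D k-means,
    and a Laplacian parameter alpha is estimated per cluster
    (for a Laplacian distribution, variance = 2 / alpha**2).
    """
    h, w = residual.shape
    h, w = h - h % block, w - w % block
    blocks = residual[:h, :w].reshape(h // block, block, w // block, block).swapaxes(1, 2)
    energy = np.abs(blocks).mean(axis=(2, 3)).ravel()          # one activity value per block
    centers = np.quantile(energy, np.linspace(0.1, 0.9, K))    # simple k-means initialization
    for _ in range(iters):
        labels = np.argmin(np.abs(energy[:, None] - centers[None, :]), axis=1)
        for k in range(K):
            if np.any(labels == k):
                centers[k] = energy[labels == k].mean()
    flat = blocks.reshape(-1, block * block)
    alphas = []
    for k in range(K):
        r = flat[labels == k].ravel()
        sigma = r.std() if r.size else 1.0
        alphas.append(np.sqrt(2.0) / max(sigma, 1e-6))         # Laplacian parameter per cluster
    return labels.reshape(h // block, w // block), np.array(alphas)
```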
Citations: 7
Interpolation of combined head and room impulse response for audio spatialization
Pub Date : 2011-12-01 DOI: 10.1109/MMSP.2011.6093794
S. Mehrotra, Weig-Ge Chen, Zhengyou Zhang
Audio spatialization is becoming an important part of creating the realistic experiences needed for immersive video conferencing and gaming. Using a combined head and room impulse response (CHRIR) has recently been proposed as an alternative to using separate head-related transfer functions (HRTF) and room impulse responses (RIR). Accurate measurements of the CHRIR at various source and listener locations and orientations are needed to perform good-quality audio spatialization. However, it is infeasible to accurately measure or model the CHRIR for all possible locations and orientations. Therefore, low-complexity and accurate interpolation techniques are needed to perform audio spatialization in real time. In this paper, we present a frequency-domain interpolation technique which naturally interpolates the interaural level difference (ILD) and interaural time difference (ITD) for each frequency component in the spectrum. The proposed technique provides an accurate, low-complexity interpolation of the CHRIR and enables a low-complexity audio spatialization technique that can be used with both headphones and loudspeakers.
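As a rough illustration of frequency-domain interpolation of level and time differences, the sketch below interpolates magnitude and unwrapped phase per frequency bin between two measured responses; it is only a simplified stand-in for the paper's technique.

```python
import numpy as np

def interpolate_response(h0, h1, t):
    """Frequency-domain interpolation between two measured impulse responses.

    Magnitude and unwrapped phase are interpolated separately in each bin,
    which is one simple way to interpolate level and time differences.
    """
    n = max(len(h0), len(h1))
    H0, H1 = np.fft.rfft(h0, n), np.fft.rfft(h1, n)
    mag = (1 - t) * np.abs(H0) + t * np.abs(H1)                               # level interpolation
    phase = (1 - t) * np.unwrap(np.angle(H0)) + t * np.unwrap(np.angle(H1))   # delay interpolation
    return np.fft.irfft(mag * np.exp(1j * phase), n)

# Example: interpolating between two pure delays yields an intermediate delay.
h0 = np.zeros(64); h0[4] = 1.0
h1 = np.zeros(64); h1[8] = 1.0
print(np.argmax(interpolate_response(h0, h1, 0.5)))   # 6
```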
Citations: 2
Augmented LDPC graph for distributed video coding with multiple side information
Pub Date : 2011-12-01 DOI: 10.1109/MMSP.2011.6093788
J. Ascenso, Catarina Brites, F. Pereira
The advances made in channel-capacity codes, such as turbo codes and low-density parity-check (LDPC) codes, have played a major role in the emerging distributed source coding paradigm. LDPC codes can be easily adapted to new source coding strategies due to their natural representation as bipartite graphs and the use of quasi-optimal decoding algorithms, such as belief propagation. This paper tackles a relevant scenario in distributed video coding: lossy source coding when multiple side information (SI) hypotheses are available at the decoder, each one correlated with the source according to different correlation noise channels. Thus, it is proposed to exploit multiple SI hypotheses through an efficient joint decoding technique with multiple LDPC syndrome decoders that exchange information to obtain coding efficiency improvements. At the decoder side, the multiple SI hypotheses are created with motion compensated frame interpolation and fused together in a novel iterative LDPC based Slepian-Wolf decoding algorithm. With the creation of multiple SI hypotheses and the proposed decoding algorithm, bitrate savings up to 8.0% are obtained for similar decoded quality.
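The gain from multiple SI hypotheses can be illustrated with a much simpler fusion rule than the paper's augmented-graph decoder: if each hypothesis is modelled as the source bitplane seen through an independent binary symmetric channel, the per-bit LLRs of the hypotheses simply add before syndrome decoding. The toy sketch below shows that fusion step only; the inputs are hypothetical.

```python
import numpy as np

def fuse_side_information(si_bits, crossover_probs):
    """Combine per-bit soft information from multiple SI hypotheses.

    Each SI hypothesis is modelled as the source bitplane observed through
    an independent binary symmetric channel with its own crossover
    probability, so the per-hypothesis LLRs add up. This is a stand-in for
    the paper's approach, which couples one LDPC syndrome decoder per
    hypothesis on an augmented graph.
    """
    si_bits = np.asarray(si_bits)                    # shape: (num_hypotheses, num_bits)
    llrs = np.zeros(si_bits.shape[1])
    for y, p in zip(si_bits, crossover_probs):
        llrs += (1 - 2 * y) * np.log((1 - p) / p)    # LLR of x=0 versus x=1 given y
    return llrs                                       # would be fed to the syndrome decoder

# Example: two hypotheses disagree on the middle bit; the more reliable one wins.
print(fuse_side_information([[0, 1, 0], [0, 0, 0]], [0.05, 0.2]))
```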
Citations: 3
A pipelined architecture for 4×4 intra frame mode decision in the high efficiency video coding
Pub Date : 2011-12-01 DOI: 10.1109/MMSP.2011.6093851
Fu Li, Guangming Shi
Mode decision in High Efficiency Video Coding (HEVC) accounts for more than half of the computational complexity of intra-frame coding, and the 4×4 block size is the most frequently used block size in the HEVC test model (HM). In this paper, we propose a pipelined architecture for 4×4 intra-frame mode decision in HEVC to improve computational capability. The architecture consists of a six-stage pipeline, and each stage completes within 24 clock cycles. For the prediction stage, we propose a folded project-skip architecture that considerably reduces processing latency and register usage. We also propose a simplified, low-complexity CAVLC for the bit-estimation stage of the pipeline. The mode decision architecture has been evaluated with TSMC 0.13 μm CMOS technology. Synthesis results show that the proposed architecture needs only 99K logic gates for mode decision and can run at an operating frequency of 165 MHz.
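The computation that the hardware pipeline carries out per 4×4 block can be sketched in software as a rate-distortion comparison of candidate intra modes. The toy version below evaluates only DC, vertical and horizontal prediction with a Hadamard SATD distortion and a fixed bit cost per mode; the paper's architecture covers the full HEVC candidate set and estimates bits with its simplified CAVLC.

```python
import numpy as np

H4 = np.array([[1,  1,  1,  1],
               [1, -1,  1, -1],
               [1,  1, -1, -1],
               [1, -1, -1,  1]], dtype=float)

def satd4(residual):
    """4x4 sum of absolute Hadamard-transformed differences."""
    return np.abs(H4 @ residual @ H4.T).sum()

def intra4_mode_decision(block, left, top, lam=4.0, mode_bits=(1, 3, 3)):
    """Pick the best 4x4 intra mode by a toy rate-distortion cost.

    Only DC, vertical and horizontal prediction are evaluated, with a fixed
    bit cost per mode; this is an illustration, not the paper's hardware.
    """
    preds = {
        "DC":  np.full((4, 4), (left.mean() + top.mean()) / 2),
        "VER": np.tile(top, (4, 1)),                    # copy the top neighbours down
        "HOR": np.tile(left.reshape(4, 1), (1, 4)),     # copy the left neighbours across
    }
    costs = {m: satd4(block - p) + lam * b
             for (m, p), b in zip(preds.items(), mode_bits)}
    return min(costs, key=costs.get), costs

# Example: a block whose columns match its top neighbours -> vertical mode wins.
top = np.array([80.0, 100.0, 120.0, 140.0])
blk = np.tile(top, (4, 1))
best, costs = intra4_mode_decision(blk, left=np.full(4, 110.0), top=top)
print(best)   # VER
```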
Citations: 5
Cooperative multi-object tracking method for Wireless Video Sensor Networks
Pub Date : 2011-12-01 DOI: 10.1109/MMSP.2011.6093796
Zheng Chu, L. Zhuo, Yingdi Zhao, Xiaoguang Li
Because of the enormous number of network nodes and their limited energy in the wireless video sensor network (WVSN) environment, multiple sensor nodes must collaborate with each other to fulfil complicated tasks. A cooperative multi-object tracking method for wireless video sensor networks is proposed in this paper. The method focuses on cooperative multi-object tracking among multiple sensor nodes when an object leaves the field of view of the tracking node. The main contributions of the proposed method are: (1) the sensing model of a video sensor and a Kalman filter are utilized to achieve optimal sensor selection; (2) projective invariants are employed to integrate information from the related nodes. The experimental results show that the proposed method is effective in resolving the tracking relay problem.
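A minimal sketch of the handover step, assuming a constant-velocity Kalman model and rectangular fields of view (the node ids and FOV format are illustrative, not from the paper): the track is predicted one step ahead and handed to a node whose view covers the predicted position. The paper's selection additionally uses the video sensing model rather than a simple coverage test.

```python
import numpy as np

def predict_state(x, P, dt=1.0, q=1e-2):
    """Constant-velocity Kalman prediction for a planar target state [px, py, vx, vy]."""
    F = np.array([[1, 0, dt, 0],
                  [0, 1, 0, dt],
                  [0, 0, 1, 0],
                  [0, 0, 0, 1]], dtype=float)
    return F @ x, F @ P @ F.T + q * np.eye(4)

def select_next_node(x, P, node_fovs):
    """Hand the track to a node whose field of view covers the predicted position.

    node_fovs maps node id -> (xmin, xmax, ymin, ymax); the first covering
    node is returned in this toy version.
    """
    x_pred, P_pred = predict_state(x, P)
    px, py = x_pred[0], x_pred[1]
    for nid, (x0, x1, y0, y1) in node_fovs.items():
        if x0 <= px <= x1 and y0 <= py <= y1:
            return nid, x_pred, P_pred
    return None, x_pred, P_pred

# Example: the target drifts right and should be handed to node "B".
x0 = np.array([9.0, 2.0, 1.5, 0.0])
print(select_next_node(x0, np.eye(4), {"A": (0, 10, 0, 5), "B": (10, 20, 0, 5)})[0])
```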
Citations: 5
A psychovisually tuned image codec
Pub Date : 2011-12-01 DOI: 10.1109/MMSP.2011.6093772
Guangtao Zhai, Xiaolin Wu, Yi Niu
A psychovisual-quality-driven image codec exploiting the psychological and neurological process of visual perception is proposed in this paper. Recent findings in brain theory and neuroscience suggest that visual perception is a process of fitting the brain's internal generative model to the outside retinal stimuli, and psychovisual quality is related to how accurately the visual sensory data can be explained by this internal generative model. Therefore, the design criterion of our psychovisually tuned image compression system is to find a compact description of the optimal generative model from the input image at the encoding end, which is then used to regenerate the output image at the decoding end. Building on an important finding from empirical natural image statistics, namely that natural images have scale-invariant features in the pixels' high-order statistics, the generative model can be efficiently compressed through model-preserving spatial downsampling at the encoder. The decoder reverses the process with a model-preserving upsampling module to generate the decoded image. The proposed system is fully standard compliant because the downsampled image can be compressed with any existing codec (JPEG2000 in this work). The proposed algorithm is shown to systematically outperform JPEG2000 over a wide bit-rate range in terms of both subjective and objective quality.
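The overall pipeline can be sketched as downsample, compress with an existing standard codec, decompress, upsample. In the sketch below the model-preserving filters are replaced by plain block averaging and pixel replication, and standard_encode / standard_decode are hypothetical placeholders for the standard codec (JPEG2000 in the paper); none of this is the authors' actual filter design.

```python
import numpy as np

def downsample2(img):
    """Simple 2x downsampling by block averaging (stand-in for model-preserving downsampling)."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    x = img[:h, :w].astype(float)
    return (x[0::2, 0::2] + x[0::2, 1::2] + x[1::2, 0::2] + x[1::2, 1::2]) / 4

def upsample2(img):
    """2x upsampling by pixel replication (stand-in for model-preserving upsampling)."""
    return np.repeat(np.repeat(img, 2, axis=0), 2, axis=1)

def psychovisual_codec(img, standard_encode, standard_decode):
    """Downsample -> standard codec -> upsample, mirroring the paper's pipeline shape.

    standard_encode / standard_decode are placeholders for any existing codec
    (JPEG2000 in the paper); they are assumptions of this sketch.
    """
    low = downsample2(img)
    bitstream = standard_encode(low)
    return upsample2(standard_decode(bitstream))

# Example with an identity "codec" just to exercise the pipeline:
img = np.random.rand(8, 8)
rec = psychovisual_codec(img, standard_encode=lambda x: x, standard_decode=lambda b: b)
print(rec.shape)   # (8, 8)
```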
Citations: 3
Local semi-supervised regression for single-image super-resolution
Pub Date : 2011-10-01 DOI: 10.1109/MMSP.2011.6093842
Yilong Tang, Xiaoli Pan, Yuan Yuan, Pingkun Yan, Luoqing Li, Xuelong Li
In this paper, we propose a local semi-supervised learning-based algorithm for single-image super-resolution. Unlike most example-based algorithms, the information in the test patches is considered while learning the local regression functions that map a low-resolution patch to a high-resolution patch. A localization strategy is generally adopted in single-image super-resolution with nearest neighbor-based algorithms; however, the poor generalization of the nearest neighbor estimation decreases the performance of such algorithms. Though this problem can be alleviated by local regression algorithms, the local training sets are usually too small to improve the performance of nearest neighbor-based algorithms significantly. To overcome this difficulty, a semi-supervised regression algorithm is used here. Unlike supervised regression, semi-supervised regression algorithms also take the information about the test samples into account, which makes them more powerful. Since numerous test patches exist, the performance of nearest neighbor-based algorithms can be further improved by employing a semi-supervised regression algorithm. Experiments verify the effectiveness of the proposed algorithm.
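The supervised core of this approach, local regression from low-resolution to high-resolution patches, can be sketched as a k-nearest-neighbour ridge regression fitted per test patch. The paper's semi-supervised step, which also feeds the unlabeled test patches into each local fit, is omitted in this sketch.

```python
import numpy as np

def local_regression_sr(test_lr, train_lr, train_hr, k=50, lam=1e-2):
    """Patch-wise local regression for super-resolution (supervised core only).

    For each low-resolution test patch, the k nearest low-resolution training
    patches are used to fit a ridge-regularized linear map to their
    high-resolution counterparts; the map is then applied to the test patch.
    """
    k = min(k, len(train_lr))
    out = np.zeros((test_lr.shape[0], train_hr.shape[1]))
    for i, p in enumerate(test_lr):
        d = np.linalg.norm(train_lr - p, axis=1)             # distances to training patches
        idx = np.argsort(d)[:k]                              # local neighbourhood
        X = np.hstack([train_lr[idx], np.ones((k, 1))])      # add a bias column
        Y = train_hr[idx]
        W = np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ Y)
        out[i] = np.append(p, 1.0) @ W                       # predict the HR patch
    return out
```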
Citations: 8
Exploring personal aspects using eye-tracking modality in Tetris-playing
Pub Date : 2011-10-01 DOI: 10.1109/MMSP.2011.6093841
Weifeng Li, Marc-Antoine Nüssli, Patrick Jermann
This paper exploits the personal aspects of an individual's eye movements in dynamic Tetris-playing environments. Effective features representing the players' eye-movement characteristics are extracted, and they are shown to differ across players. Delta features are also calculated to capture the dynamic changes of the static features. A series of personal identification experiments is performed using hidden Markov models (HMMs). Our experimental results show that, compared with local information, modeling and tracking the dynamic temporal information (i.e., the delta features) is more important for distinguishing different players' eye movements. Given 10 consecutive zoids of playing signals (about 30 seconds), we can achieve an identification rate of 82.1% by combining both.
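Delta features of this kind are typically the regression-based temporal derivatives appended to each static feature vector; a small sketch is given below, with a commented outline of a per-player HMM identification step (assuming the hmmlearn package, which is not mentioned in the paper).

```python
import numpy as np

def add_delta_features(feats, width=2):
    """Append delta (temporal derivative) features to a (T, D) feature matrix.

    Deltas are computed with the standard regression formula over +/- width
    frames, one common way to capture dynamic changes of static features.
    """
    T, _ = feats.shape
    pad = np.pad(feats, ((width, width), (0, 0)), mode="edge")
    num = sum(n * (pad[width + n: width + n + T] - pad[width - n: width - n + T])
              for n in range(1, width + 1))
    den = 2 * sum(n * n for n in range(1, width + 1))
    return np.hstack([feats, num / den])

# Per-player identification outline (assuming hmmlearn is available; names are illustrative):
#   from hmmlearn.hmm import GaussianHMM
#   models = {p: GaussianHMM(n_components=4).fit(add_delta_features(train[p])) for p in players}
#   guess = max(models, key=lambda p: models[p].score(add_delta_features(test_seq)))
```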
Citations: 2
Efficient image transmission through analog error correction
Pub Date : 2011-05-08 DOI: 10.1109/MMSP.2011.6093827
Yang Liu, T. Li, Kai Xie
This paper presents a new paradigm for image transmission through analog error correction codes. Conventional schemes rely on digitizing images through quantization (which inevitably causes significant bandwidth expansion) and transmitting binary bit-streams through digital error correction codes (which do not automatically differentiate the different levels of significance among the bits). To strike a better overall trade-off between transmission efficiency and quality, we propose to use a single analog error correction code in lieu of digital quantization, digital coding and digital modulation. The key is to get the analog coding right. We show that this can be achieved by cleverly exploiting an elegant "butterfly" property of chaotic systems. Specifically, we demonstrate a tail-biting triple-branch baker's map code and its maximum-likelihood decoding algorithm. Simulations show that the proposed analog code can actually outperform the digital turbo code, one of the best codes known to date. The results and findings discussed in this paper speak volumes for the promising potential of analog codes, in spite of their rather short history.
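The flavour of chaotic analog coding can be shown with a toy code that uses the tent map in place of the paper's tail-biting triple-branch baker's map: an analog source value is transmitted as a short chaotic trajectory, and maximum-likelihood decoding under Gaussian noise reduces to a nearest-trajectory search, done here by brute force over a dense grid rather than with the paper's efficient decoder.

```python
import numpy as np

def tent_map(x):
    """Simple chaotic map used as a stand-in for the paper's baker's map."""
    return np.where(x < 0.5, 2 * x, 2 - 2 * x)

def encode(u, n_symbols=8):
    """Encode one analog value u in [0, 1] as its chaotic trajectory (toy analog code)."""
    x, out = float(u), []
    for _ in range(n_symbols):
        out.append(x)
        x = float(tent_map(x))
    return np.array(out)

def ml_decode(received, n_symbols=8, grid=100000):
    """Brute-force ML decoding: under white Gaussian noise, pick the candidate
    source value whose trajectory is closest to the received sequence."""
    candidates = np.linspace(0.0, 1.0, grid)
    traj = np.empty((grid, n_symbols))
    x = candidates.copy()
    for k in range(n_symbols):
        traj[:, k] = x
        x = tent_map(x)
    errs = ((traj - received) ** 2).sum(axis=1)
    return candidates[np.argmin(errs)]

# Example: small per-symbol noise is largely rejected by the joint ML estimate.
u = 0.37319
rx = encode(u) + 0.01 * np.random.randn(8)
print(ml_decode(rx))    # close to 0.37319
```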
Citations: 10