Pub Date: 2011-12-01 | DOI: 10.1109/MMSP.2011.6093801
Seyun Kim, N. Cho
This paper proposes a new color filter array demosaicking method with emphasis on edge estimation. In many existing approaches, demosaicking is treated as a directional interpolation problem, so finding the correct edge direction is a very important factor. However, these methods sometimes fail to determine an accurate interpolation direction because they use only local information from neighboring pixels. To estimate edge directions using global information, we employ an MRF framework in which the energy function is formulated by defining new notions of interpolation risk and pixel connectivity. Minimizing this function gives the edge directions, and the green channel is interpolated along the edges. We then iterate the luminance update and color correction using the high frequencies from the green channel. The algorithm is tested on commonly used images and is shown to yield higher CPSNR than state-of-the-art methods on many images, by up to 2.7 dB and by 0.4 dB on average. Subjective comparison also shows that the proposed method produces fewer artifacts on complex structures.
Title: Color filter array demosaicking using optimized edge direction map
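As a concrete illustration of the directional-interpolation setting this paper improves on, the sketch below interpolates the green channel of a Bayer mosaic by picking, at each non-green site, the direction with the smaller local gradient. This is the purely local decision rule the paper replaces with a global MRF optimization; the MRF energy, interpolation-risk, and pixel-connectivity terms are not reproduced, and the checkerboard green layout and flat test image are assumptions of this toy.

```python
import numpy as np

def interpolate_green(bayer, pattern_green):
    """Estimate green at non-green sites by interpolating along the
    direction (horizontal or vertical) with the smaller local gradient."""
    H, W = bayer.shape
    green = np.where(pattern_green, bayer, 0.0).astype(float)
    for y in range(2, H - 2):
        for x in range(2, W - 2):
            if pattern_green[y, x]:
                continue
            # gradients between the neighbouring green samples
            dh = abs(bayer[y, x - 1] - bayer[y, x + 1])
            dv = abs(bayer[y - 1, x] - bayer[y + 1, x])
            if dh <= dv:   # smoother horizontally: interpolate along the row
                green[y, x] = (bayer[y, x - 1] + bayer[y, x + 1]) / 2.0
            else:          # smoother vertically: interpolate along the column
                green[y, x] = (bayer[y - 1, x] + bayer[y + 1, x]) / 2.0
    return green

# Checkerboard green layout; flat image (greens 100, red/blue sites 50)
H, W = 8, 8
yy, xx = np.mgrid[0:H, 0:W]
pattern_green = (yy + xx) % 2 == 0
bayer = np.where(pattern_green, 100.0, 50.0)
green = interpolate_green(bayer, pattern_green)
```

On a flat image every interpolated green value in the interior should equal the green level, 100, regardless of the chosen direction.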
Pub Date: 2011-12-01 | DOI: 10.1109/MMSP.2011.6093774
Huynh Van Luong, Xin Huang, Søren Forchhammer
The noise model is one of the most important factors influencing the coding performance of distributed video coding. This paper proposes a novel noise model for Transform Domain Wyner-Ziv (TDWZ) video coding based on clustering of DCT blocks. The clustering algorithm takes advantage of the residual information of all frequency bands, iteratively classifies blocks into different categories, and estimates the noise parameter of each category. Experimental results show that the coding performance of the proposed cluster-level noise model is competitive with state-of-the-art coefficient-level noise modelling. Furthermore, the proposed cluster-level noise model is adaptively combined with a coefficient-level noise model to robustly improve the coding performance of the TDWZ video codec by up to 1.24 dB (Bjøntegaard metric) compared to the DISCOVER TDWZ video codec.
Title: Adaptive noise model for transform domain Wyner-Ziv video using clustering of DCT blocks
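The cluster-level idea can be sketched in a few lines: classify residual blocks into categories and estimate one noise parameter per category. The simple one-dimensional k-means on block residual energy below is illustrative only; the paper's iterative classification over all frequency bands is more elaborate, and the Laplacian parameterization is a common choice in TDWZ noise modelling rather than the paper's exact estimator.

```python
import numpy as np

def cluster_noise_params(residual_blocks, k=2, iters=10):
    """1-D k-means on block residual energy, then one Laplacian
    noise parameter per cluster."""
    energies = np.array([np.mean(b ** 2) for b in residual_blocks])
    centers = np.linspace(energies.min(), energies.max(), k)
    for _ in range(iters):
        labels = np.argmin(np.abs(energies[:, None] - centers[None, :]), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = energies[labels == j].mean()
    # Laplacian alpha = sqrt(2 / sigma^2): smaller residual variance -> larger alpha
    alphas = np.sqrt(2.0 / np.maximum(centers, 1e-12))
    return labels, alphas

rng = np.random.default_rng(0)
quiet = [rng.normal(0, 1, (16, 16)) for _ in range(20)]   # well-predicted blocks
busy = [rng.normal(0, 10, (16, 16)) for _ in range(20)]   # poorly predicted blocks
labels, alphas = cluster_noise_params(quiet + busy, k=2)
```

With two clearly separated noise levels, the clustering should put all quiet blocks in one category and all busy blocks in the other, with a larger alpha (lower noise) for the quiet category.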
Pub Date: 2011-12-01 | DOI: 10.1109/MMSP.2011.6093794
S. Mehrotra, Weig-Ge Chen, Zhengyou Zhang
Audio spatialization is becoming an important part of creating the realistic experiences needed for immersive video conferencing and gaming. Using a combined head and room impulse response (CHRIR) has recently been proposed as an alternative to using separate head-related transfer functions (HRTFs) and room impulse responses (RIRs). Accurate measurements of the CHRIR at various source and listener locations and orientations are needed for good-quality audio spatialization. However, it is infeasible to accurately measure or model the CHRIR for all possible locations and orientations, so low-complexity, accurate interpolation techniques are needed to perform audio spatialization in real time. In this paper, we present a frequency-domain interpolation technique that naturally interpolates the interaural level difference (ILD) and interaural time difference (ITD) for each frequency component in the spectrum. The proposed technique allows accurate, low-complexity interpolation of the CHRIR and supports a low-complexity audio spatialization technique that can be used with both headphones and loudspeakers.
Title: Interpolation of combined head and room impulse response for audio spatialization
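A minimal sketch of why frequency-domain interpolation behaves well: interpolating log-magnitude (level) and unwrapped phase (delay) per bin blends level differences and time shifts smoothly, whereas time-domain cross-fading of two shifted impulses would produce two separate peaks. This toy uses pure delays and simple linear blending; the paper's treatment of measured CHRIRs is more involved.

```python
import numpy as np

def interp_ir(h0, h1, w):
    """Interpolate two impulse responses in the frequency domain:
    blend magnitude and unwrapped phase separately with weight w."""
    H0, H1 = np.fft.rfft(h0), np.fft.rfft(h1)
    mag = (1 - w) * np.abs(H0) + w * np.abs(H1)
    ph = (1 - w) * np.unwrap(np.angle(H0)) + w * np.unwrap(np.angle(H1))
    return np.fft.irfft(mag * np.exp(1j * ph), n=len(h0))

N = 64
h0 = np.zeros(N); h0[4] = 1.0   # impulse delayed by 4 samples
h1 = np.zeros(N); h1[8] = 1.0   # impulse delayed by 8 samples
h_mid = interp_ir(h0, h1, 0.5)  # halfway: a single impulse at delay 6
```

Because both inputs have unit magnitude and linear phase, the halfway interpolation is exactly a unit impulse at the intermediate delay of 6 samples, i.e. the ITD is interpolated rather than smeared.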
Pub Date: 2011-12-01 | DOI: 10.1109/MMSP.2011.6093788
J. Ascenso, Catarina Brites, F. Pereira
The advances made in channel-capacity codes, such as turbo codes and low-density parity-check (LDPC) codes, have played a major role in the emerging distributed source coding paradigm. LDPC codes can be easily adapted to new source coding strategies due to their natural representation as bipartite graphs and the use of quasi-optimal decoding algorithms, such as belief propagation. This paper tackles a relevant scenario in distributed video coding: lossy source coding when multiple side information (SI) hypotheses are available at the decoder, each one correlated with the source according to a different correlation noise channel. We therefore propose to exploit multiple SI hypotheses through an efficient joint decoding technique with multiple LDPC syndrome decoders that exchange information to obtain coding efficiency improvements. At the decoder side, the multiple SI hypotheses are created with motion-compensated frame interpolation and fused together in a novel iterative LDPC-based Slepian-Wolf decoding algorithm. With the creation of multiple SI hypotheses and the proposed decoding algorithm, bitrate savings of up to 8.0% are obtained for similar decoded quality.
Title: Augmented LDPC graph for distributed video coding with multiple side information
Pub Date: 2011-12-01 | DOI: 10.1109/MMSP.2011.6093851
Fu Li, Guangming Shi
Mode decision accounts for more than half of the computational complexity of intra-frame coding in High Efficiency Video Coding (HEVC), and the 4×4 block size is the most frequently used block size in the HM reference software. In this paper, we propose a pipelined architecture for 4×4 intra-frame mode decision in HEVC to improve computational capability. The architecture consists of six pipeline stages, each of which completes within 24 clock cycles. For the prediction stage, we propose a folded project-skip architecture that considerably reduces processing latency and register count. We also propose a simplified, low-complexity CAVLC for the bit-estimation stage. The mode decision architecture has been evaluated with TSMC 0.13 μm CMOS technology. Synthesis results show that the proposed architecture needs only 99K logic gates for mode decision and can run at an operating frequency of 165 MHz.
Title: A pipelined architecture for 4×4 intra frame mode decision in the high efficiency video coding
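The computation being pipelined is, at its core, a cost comparison across candidate intra predictors. The software sketch below picks the best of three simple directional predictors for a 4×4 block by sum of absolute differences (SAD); the predictor set is a reduced stand-in, and the paper's rate estimation (its simplified CAVLC) and the six-stage hardware pipelining are not modelled.

```python
import numpy as np

def predict(mode, left, top):
    """Three simple intra predictors for a 4x4 block from its
    left column and top row of reconstructed neighbours."""
    if mode == "vertical":
        return np.tile(top, (4, 1))             # copy the row above downward
    if mode == "horizontal":
        return np.tile(left[:, None], (1, 4))   # copy the left column rightward
    return np.full((4, 4), (left.mean() + top.mean()) / 2.0)  # DC

def best_mode(block, left, top):
    """Mode decision: minimise SAD over the candidate predictors."""
    costs = {m: np.abs(block - predict(m, left, top)).sum()
             for m in ("vertical", "horizontal", "dc")}
    return min(costs, key=costs.get)

top = np.array([10.0, 20.0, 30.0, 40.0])
left = np.array([25.0, 25.0, 25.0, 25.0])
block = np.tile(top, (4, 1))   # block whose columns copy the row above
mode = best_mode(block, left, top)
```

Since the block exactly continues the top neighbours downward, the vertical predictor has zero SAD and wins the decision.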
Pub Date: 2011-12-01 | DOI: 10.1109/MMSP.2011.6093796
Zheng Chu, L. Zhuo, Yingdi Zhao, Xiaoguang Li
Given the enormous number of network nodes and their limited energy in wireless video sensor networks (WVSNs), multiple sensor nodes must collaborate with each other to fulfil complicated tasks. This paper proposes a cooperative multi-object tracking method for WVSNs, focused on cooperative tracking among multiple sensor nodes when an object leaves the field of view of the current tracking node. The main contributions of the proposed method are: (1) the sensing model of a video sensor and a Kalman filter are utilized to achieve optimal sensor selection; and (2) projective invariants are employed to integrate information from the related nodes. Experimental results show that the proposed method effectively solves the tracking-relay problem.
Title: Cooperative multi-object tracking method for Wireless Video Sensor Networks
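The sensor-selection step can be sketched as: a constant-velocity Kalman filter predicts the target's next position, and the node whose sensing range covers that prediction takes over tracking. The circular sensing range, unit time step, and node layout below are assumptions of this toy; the paper's directional sensing model and projective-invariant information hand-over are not reproduced.

```python
import numpy as np

def predict_next(x, P, F, Q):
    """Kalman time update (prediction) for state x with covariance P."""
    return F @ x, F @ P @ F.T + Q

# Constant-velocity model, state = [px, py, vx, vy], dt = 1
F = np.array([[1.0, 0.0, 1.0, 0.0],
              [0.0, 1.0, 0.0, 1.0],
              [0.0, 0.0, 1.0, 0.0],
              [0.0, 0.0, 0.0, 1.0]])
Q = 0.01 * np.eye(4)

x = np.array([9.0, 0.0, 2.0, 0.0])   # target at (9, 0) moving right
P = np.eye(4)
x_pred, P_pred = predict_next(x, P, F, Q)   # predicted position (11, 0)

# Candidate nodes with circular sensing range; pick the covering node
# closest to the predicted position
nodes = {"A": (0.0, 0.0), "B": (12.0, 0.0), "C": (30.0, 0.0)}
radius = 5.0
in_range = [n for n, c in nodes.items()
            if np.hypot(x_pred[0] - c[0], x_pred[1] - c[1]) <= radius]
selected = min(in_range,
               key=lambda n: np.hypot(x_pred[0] - nodes[n][0],
                                      x_pred[1] - nodes[n][1]))
```

Only node B's range covers the predicted position (11, 0), so the tracking relay hands over to B.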
Pub Date: 2011-12-01 | DOI: 10.1109/MMSP.2011.6093772
Guangtao Zhai, Xiaolin Wu, Yi Niu
A psychovisual-quality-driven image codec exploiting the psychological and neurological process of visual perception is proposed in this paper. Recent findings in brain theory and neuroscience suggest that visual perception is a process of fitting the brain's internal generative model to the incoming retinal stimuli, and that psychovisual quality is related to how accurately the visual sensory data can be explained by this internal generative model. Therefore, the design criterion of our psychovisually tuned image compression system is to find a compact description of the optimal generative model of the input image at the encoder, which is then used to regenerate the output image at the decoder. Exploiting an important finding from empirical natural image statistics, that natural images have scale-invariant features in their high-order pixel statistics, the generative model can be efficiently compressed through model-preserving spatial downsampling at the encoder, and the decoder can reverse the process with a model-preserving upsampling module to generate the decoded image. The proposed system is fully standard-compliant because the downsampled image can be compressed with any existing codec (JPEG2000 in this work). The proposed algorithm is shown to systematically outperform JPEG2000 over a wide bitrate range in terms of both subjective and objective quality.
Title: A psychovisually tuned image codec
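The outer structure of such a system can be sketched as downsample, encode with a standard codec, decode, then upsample. Below, plain 2× block-mean downsampling and nearest-neighbour upsampling stand in for the paper's model-preserving resampling, and the inner codec (JPEG2000) is omitted entirely, so this only shows the pipeline shape and a PSNR check, not the psychovisual optimization.

```python
import numpy as np

def down2(img):
    """2x downsampling by 2x2 block averaging (stand-in resampler)."""
    h, w = img.shape
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def up2(img):
    """2x nearest-neighbour upsampling (stand-in for the decoder module)."""
    return np.repeat(np.repeat(img, 2, axis=0), 2, axis=1)

def psnr(a, b, peak=255.0):
    mse = np.mean((a - b) ** 2)
    return 10 * np.log10(peak ** 2 / mse) if mse > 0 else np.inf

# Smooth ramp image: easy to resample, so the down/up round trip is near-lossless
yy, xx = np.mgrid[0:64, 0:64]
smooth = (xx + yy).astype(float)
recon = up2(down2(smooth))        # inner codec omitted
score = psnr(smooth, recon)
```

On smooth content the round trip loses very little (here about 51 dB PSNR), which is the regime where spending the bit budget on a downsampled image pays off; the paper's contribution is making the resampling preserve the generative model on complex content too.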
Pub Date: 2011-10-01 | DOI: 10.1109/MMSP.2011.6093842
Yilong Tang, Xiaoli Pan, Yuan Yuan, Pingkun Yan, Luoqing Li, Xuelong Li
In this paper, we propose a local semi-supervised learning-based algorithm for single-image super-resolution. Unlike most example-based algorithms, the information in the test patches is considered when learning the local regression functions that map a low-resolution patch to a high-resolution patch. A localization strategy is generally adopted in single-image super-resolution with nearest-neighbor-based algorithms; however, the poor generalization of nearest-neighbor estimation limits the performance of such algorithms. Although this problem can be addressed by local regression, the local training sets are usually too small to improve the performance of nearest-neighbor-based algorithms significantly. To overcome this difficulty, a semi-supervised regression algorithm is used here. Unlike supervised regression, semi-supervised regression also considers information about the test samples, which makes it more powerful. Since numerous test patches exist, the performance of nearest-neighbor-based algorithms can be further improved by employing semi-supervised regression. Experiments verify the effectiveness of the proposed algorithm.
Title: Local semi-supervised regression for single-image super-resolution
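The supervised local-regression baseline the paper builds on can be sketched directly: for a test low-resolution (LR) patch, fit a ridge regression on its nearest LR/HR training pairs and predict the HR patch. The paper's semi-supervised step, which additionally uses the unlabelled test patches when fitting, is not reproduced; the synthetic linear LR-to-HR relation below is purely for illustration.

```python
import numpy as np

def local_ridge_predict(lr_train, hr_train, lr_test, k=20, lam=1e-3):
    """Predict an HR patch by ridge regression fitted on the k nearest
    LR/HR training pairs of the test LR patch."""
    d = np.linalg.norm(lr_train - lr_test, axis=1)
    idx = np.argsort(d)[:k]                  # local training set
    X, Y = lr_train[idx], hr_train[idx]
    W = np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ Y)
    return lr_test @ W

rng = np.random.default_rng(0)
lr_train = rng.normal(size=(200, 4))         # 4-dim LR patch features
A = rng.normal(size=(4, 16))                 # ground-truth linear LR->HR map
hr_train = lr_train @ A                      # 16-dim HR patches
lr_test = rng.normal(size=4)
hr_pred = local_ridge_predict(lr_train, hr_train, lr_test)
err = np.linalg.norm(hr_pred - lr_test @ A)
```

When the LR-to-HR relation really is locally linear, the local ridge fit recovers it almost exactly; the paper's point is that with small local training sets, adding the test patches as unlabelled data stabilizes this fit.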
Pub Date: 2011-10-01 | DOI: 10.1109/MMSP.2011.6093841
Weifeng Li, Marc-Antoine Nüssli, Patrick Jermann
This paper exploits the personal aspects of an individual's eye movements in dynamic Tetris-playing environments. Effective features representing the players' eye-movement characteristics are extracted and shown to differ across players. Delta features are also calculated to represent the dynamic changes of the static features. A series of person identification experiments is performed using hidden Markov models (HMMs). The experimental results show that, compared with local information, modeling and tracking the dynamic temporal information (i.e., the delta features) is more important in distinguishing different players' eye movements. Given 10 consecutive zoids of playing signals (about 30 seconds), we achieve an identification rate of 82.1% by combining both.
Title: Exploring personal aspects using eye-tracking modality in Tetris-playing
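The delta-feature step mentioned above is the standard one from sequence modelling: frame-to-frame differences of the static feature vectors are appended to those vectors before HMM training. A minimal sketch (the HMM itself is not shown):

```python
import numpy as np

def add_delta(features):
    """Append first-order delta features (frame-to-frame differences)
    to a (T, D) sequence of static feature vectors, giving (T, 2D)."""
    delta = np.diff(features, axis=0, prepend=features[:1])  # first delta is 0
    return np.hstack([features, delta])

# Toy sequence of 2-D static eye-movement features over 3 frames
static = np.array([[1.0, 10.0],
                   [2.0, 10.0],
                   [4.0, 12.0]])
full = add_delta(static)
```

The last frame, for example, becomes `[4, 12, 2, 2]`: its static features followed by the change since the previous frame, which is the temporal information the paper finds most discriminative.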
Pub Date: 2011-05-08 | DOI: 10.1109/MMSP.2011.6093827
Yang Liu, T. Li, Kai Xie
This paper presents a new paradigm for image transmission based on analog error correction codes. Conventional schemes rely on digitizing images through quantization (which inevitably causes significant bandwidth expansion) and transmitting binary bit-streams through digital error correction codes (which do not automatically differentiate the levels of significance among the bits). To strike a better overall balance between transmission efficiency and quality, we propose to use a single analog error correction code in place of digital quantization, digital coding, and digital modulation. The key is to get the analog coding right. We show that this can be achieved by cleverly exploiting an elegant “butterfly” property of chaotic systems. Specifically, we demonstrate a tail-biting triple-branch baker's map code and its maximum-likelihood decoding algorithm. Simulations show that the proposed analog code can actually outperform the digital turbo code, one of the best codes known to date. The results and findings discussed in this paper speak volumes for the promising potential of analog codes, in spite of their rather short history.
Title: Efficient image transmission through analog error correction
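The “butterfly” idea can be demonstrated with a much simpler chaotic map than the paper's: encode an analog value as the trajectory of the doubling map, whose sensitivity to initial conditions spreads information about the source over the whole sequence, and decode by brute-force maximum likelihood. This one-branch toy is a stand-in, not the paper's tail-biting triple-branch baker's map code or its ML decoder.

```python
import numpy as np

def encode(x0, n):
    """Analog 'codeword': n iterates of the doubling map from source x0 in [0,1).
    Tiny differences in x0 double every step (the butterfly property)."""
    seq, x = [], x0
    for _ in range(n):
        seq.append(x)
        x = (2 * x) % 1.0
    return np.array(seq)

def ml_decode(rx, n, grid=4096):
    """Brute-force ML under Gaussian noise: pick the candidate source whose
    trajectory is closest in squared error to the received sequence."""
    cands = (np.arange(grid) + 0.5) / grid
    costs = [np.sum((encode(c, n) - rx) ** 2) for c in cands]
    return cands[int(np.argmin(costs))]

rng = np.random.default_rng(2)
x0 = 0.3141
tx = encode(x0, 8)
rx = tx + rng.normal(0, 0.01, 8)   # mild channel noise
x_hat = ml_decode(rx, 8)
err = abs(x_hat - x0)
```

Because wrong candidates diverge exponentially from the true trajectory, the ML cost is sharply peaked around the true source, and the decoded value lands very close to it even through noise; this sensitivity-as-protection is the intuition the paper builds its code on.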