Pub Date: 2007-11-21; DOI: 10.1109/SIPS.2007.4387526
Fu Hong-liang, Feng Guangzeng
A novel space-time block coding (STBC) scheme for MIMO CDMA systems is presented. In the proposed scheme, the input symbols are split into blocks of M symbols, where M is the number of transmit antennas. The M symbols of each block are circularly encoded into M groups, each containing M cyclic symbols, and the encoded M × M symbols are transmitted in M symbol durations through M antennas after multiplication by M spreading codes, respectively. The proposed STBC scheme achieves full coding rate (rate 1) and full diversity order, yet its decoding is as simple as that of the conventional STBC introduced by Alamouti [1].
{"title":"A Novel STBC Scheme in MIMO CDMA System","authors":"Fu Hong-liang, Feng Guangzeng","doi":"10.1109/SIPS.2007.4387526","DOIUrl":"https://doi.org/10.1109/SIPS.2007.4387526","url":null,"abstract":"A novel STBC scheme in MIMO CDMA system is presented. In the proposed scheme, the input symbols are split in blocks, each block has M symbols (M is the number of transmitting antennas), the M symbols are circularly encoded into M groups (each group has M cyclic symbols), the encoded M × M symbols are transmitted in Msymbol durations through Mantennas by multiplying M spread-codes respectively. The proposed STBC scheme has full coding rate (rate 1) and full diversity order but the decoding method is as simple as the conventional STBC introduced by Alamouti [1].","PeriodicalId":93225,"journal":{"name":"Proceedings. IEEE Workshop on Signal Processing Systems (2007-2014)","volume":"1 1","pages":"101-104"},"PeriodicalIF":0.0,"publicationDate":"2007-11-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81872707","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
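The circular encoding step described in the abstract above can be illustrated with a minimal Python sketch. It assumes that each of the M groups is a cyclic rotation of the input block, which is one plausible reading of "circularly encoded"; the function names and the rotation direction are assumptions, and the spreading codes and antenna mapping are left out.

```python
def circular_encode(block):
    """Return M cyclically rotated copies of a block of M symbols.

    Assumption: group g is the block rotated left by g positions.
    The paper's exact mapping may differ.
    """
    m = len(block)
    return [[block[(g + k) % m] for k in range(m)] for g in range(m)]


def code_rate(num_symbols, num_durations):
    """Information symbols conveyed per symbol duration."""
    return num_symbols / num_durations


# Example with M = 3 placeholder symbols.
groups = circular_encode(["s0", "s1", "s2"])
```

Since a block of M symbols occupies M symbol durations, `code_rate(M, M)` is 1, which matches the full-rate claim in the abstract.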
Pub Date: 2007-11-21; DOI: 10.1109/SIPS.2007.4387547
Zhiqiang Cui, Zhongfeng Wang
This paper studies practical low-complexity decoding of low-density parity-check (LDPC) codes. We first investigate VLSI implementation issues of two state-of-the-art weighted bit flipping (WBF) based decoding algorithms recently proposed in the literature. We then present an optimized 2-bit soft decoding approach. The proposed approach is shown to have hardware complexity comparable to that of either WBF-based algorithm while achieving significantly better decoding performance.
{"title":"Studies on Practical Low Complexity Decoding of Low-Density Parity-Check Codes","authors":"Zhiqiang Cui, Zhongfeng Wang","doi":"10.1109/SIPS.2007.4387547","DOIUrl":"https://doi.org/10.1109/SIPS.2007.4387547","url":null,"abstract":"This paper studies practical low complexity decoding of Low Density Parity-Check (LDPC) codes. We first investigate VLSI implementation issues of two state-of-the-art Weighted Bit Flipping (WBF) based decoding algorithms that were recently proposed in the literature. Then we present an optimized 2-bit soft decoding approach. It is shown that the proposed approach has comparable hardware complexity with either of the two WBF-based algorithms while it has significantly better decoding performance.","PeriodicalId":93225,"journal":{"name":"Proceedings. IEEE Workshop on Signal Processing Systems (2007-2014)","volume":"26 1","pages":"216-221"},"PeriodicalIF":0.0,"publicationDate":"2007-11-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84041913","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
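For readers unfamiliar with bit-flipping decoding, the following Python sketch shows the basic hard-decision variant that WBF algorithms build on. It is not either of the WBF algorithms or the 2-bit soft decoder studied in the paper: there are no reliability weights, and the parity-check matrix is that of a (7,4) Hamming code, chosen only because it is small (it is not a low-density code). The function name and the scoring rule (failed checks minus satisfied checks) are assumptions for illustration.

```python
# Parity-check matrix of a (7,4) Hamming code, used here as a toy example.
H = [
    [1, 0, 1, 0, 1, 0, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]


def bit_flip_decode(received, h=H, max_iters=10):
    """Hard-decision bit flipping: flip the most suspicious bit each round."""
    bits = list(received)
    n = len(bits)
    for _ in range(max_iters):
        # One syndrome bit per parity check; 1 means the check fails.
        syndrome = [sum(row[j] & bits[j] for j in range(n)) % 2 for row in h]
        if not any(syndrome):
            break  # all checks satisfied
        # Score each bit: +1 per failed check it takes part in, -1 per satisfied one.
        scores = [
            sum((1 if s else -1) for row, s in zip(h, syndrome) if row[j])
            for j in range(n)
        ]
        worst = max(range(n), key=scores.__getitem__)
        bits[worst] ^= 1
    return bits
```

A weighted variant would scale each check's contribution by a reliability value derived from the channel output, which is where the hardware cost discussed in the abstract comes from.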
Pub Date: 2007-11-21; DOI: 10.1109/SIPS.2007.4387600
Yang Song, Ming Shao, Zhenyu Liu, Shen Li, Lingfeng Li, T. Ikenaga, S. Goto
An H.264/AVC fractional motion estimation (FME) engine for HDTV 1080p is proposed in this paper. To provide real-time processing capability at reasonable hardware cost, several techniques are presented. First, the H.264/AVC algorithm is optimized so that only 1 reference frame and block modes above 8 × 8 are supported. As a result, the computation is reduced to 11.4% with a PSNR loss of only 0.1 dB. Second, lossless inside-mode and cross-mode reuse techniques are adopted, which reduce pixel generation and SATD calculation by about 65%. Third, lossless optimized FME scheduling is used to remove the pipeline bubbles between adjacent 1/2-pel and 1/4-pel FME. The proposed FME engine is realized in TSMC 0.18 µm 1P6M CMOS technology and costs 203.2K gates and 52.8 KB of SRAM. At 200 MHz, the proposed FME engine can encode HDTV 1080p in real time at 30 fps with a power cost of 236 mW.
{"title":"H.264/AVC Fractional Motion Estimation Engine with Computation Reusing in HDTV1080P Real-Time Encoding Applications","authors":"Yang Song, Ming Shao, Zhenyu Liu, Shen Li, Lingfeng Li, T. Ikenaga, S. Goto","doi":"10.1109/SIPS.2007.4387600","DOIUrl":"https://doi.org/10.1109/SIPS.2007.4387600","url":null,"abstract":"H.264/AVC fractional motion estimation (FME) engine for HDTV1080p is proposed in this paper. In order to provide real-time processing capability with reasonable hardware cost, several techniques have been presented. Firstly, the H.264/AVC is optimized and only 1 reference frame and block modes above 8 × 8 are supported. Therefore, the computation is reduced to 11.4% and the PSNR loss is only 0.1dB. Secondly, the lossless inside-mode and cross-mode reusing techniques are adopted, which can reduce about 65% pixel generation and SATD calculation. Thirdly, the lossless optimized FME scheduling is used to remove the pipeline bubbles between adjacent 1/2-pel and 1/4-pel FME. The proposed FME engine is realized with TSMC 0.18¿m 1P6M CMOS technology and costs 203.2K gates and 52.8KB SRAM. Under 200MHz frequency, the proposed FME engine can real-time encode HDTV1080p at 30fps with 236mW power cost.","PeriodicalId":93225,"journal":{"name":"Proceedings. IEEE Workshop on Signal Processing Systems (2007-2014)","volume":"1 1","pages":"509-514"},"PeriodicalIF":0.0,"publicationDate":"2007-11-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88899187","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
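The SATD (sum of absolute transformed differences) cost mentioned in the abstract above can be sketched in a few lines of Python. This is a plain 4 × 4 Hadamard-based SATD without any normalization factor (encoders commonly apply a scaling that varies between implementations); it illustrates the cost function only, not the engine's reuse techniques. The names `H4`, `matmul4`, and `satd4x4` are assumptions.

```python
# 4x4 Hadamard matrix (symmetric, entries +1/-1).
H4 = [
    [1, 1, 1, 1],
    [1, 1, -1, -1],
    [1, -1, -1, 1],
    [1, -1, 1, -1],
]


def matmul4(a, b):
    """Multiply two 4x4 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def satd4x4(cur, ref):
    """Unnormalized SATD between two 4x4 pixel blocks."""
    diff = [[cur[i][j] - ref[i][j] for j in range(4)] for i in range(4)]
    h_t = [list(col) for col in zip(*H4)]  # transpose of H4
    transformed = matmul4(matmul4(H4, diff), h_t)
    return sum(abs(v) for row in transformed for v in row)
```

Because every candidate position requires interpolated pixels followed by this transform, sharing interpolated pixels between block modes saves both pixel generation and SATD work, which is the kind of saving the abstract reports.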
Pub Date: 2007-11-21; DOI: 10.1109/SIPS.2007.4387599
Guo-Shiuan Yu, Tian-Sheuan Chang
The long initial access cycles of SDRAM are the major performance burden of motion compensation in a video decoder. To minimize this effect while improving the overall available memory bandwidth, this paper presents an optimal data mapping scheme for motion compensation in H.264 video coding. The scheme allocates video data to suitable addresses and banks according to the characteristics of SDRAM access and of address transitions in motion compensation. The resulting allocation reduces the required bandwidth of motion compensation by 36% compared with the previous design for 525SD video sequences.
{"title":"Optimal Data Mapping for Motion Compensation in H.264 Video Decoding","authors":"Guo-Shiuan Yu, Tian-Sheuan Chang","doi":"10.1109/SIPS.2007.4387599","DOIUrl":"https://doi.org/10.1109/SIPS.2007.4387599","url":null,"abstract":"Long initial access cycles of SDRAM are the major performance burden of motion compensation in a video decoder. To minimize its effect while improve overall available memory bandwidth, this paper presents an optimal data mapping scheme for motion compensation in H.264 video coding. This scheme allocates the video data into suitable address and bank according to the access characteristics of SDRAM access and address transition in motion compensation. The resulted allocation can reduce the required bandwidth of motion compensation by 36% when compared to the previous design for 525SD video sequences.","PeriodicalId":93225,"journal":{"name":"Proceedings. IEEE Workshop on Signal Processing Systems (2007-2014)","volume":"217 1","pages":"505-508"},"PeriodicalIF":0.0,"publicationDate":"2007-11-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79696616","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
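The idea of mapping video data to banks can be illustrated with a small Python sketch. It is not the mapping proposed in the paper: the 8 × 8 tile size, the four banks, and the checkerboard-style bank assignment are all assumptions, chosen so that any 2 × 2 neighborhood of tiles lands in four different banks and a reference block straddling tile boundaries can overlap row activations across banks.

```python
TILE_W = 8     # hypothetical tile width in pixels
TILE_H = 8     # hypothetical tile height in pixels
NUM_BANKS = 4  # hypothetical number of SDRAM banks


def tile_bank(tx, ty):
    """Bank index for the tile at tile coordinates (tx, ty)."""
    return (tx % 2) + 2 * (ty % 2)


def banks_touched(x, y, width, height):
    """Set of banks accessed when fetching a reference block.

    (x, y) is the top-left pixel of the block; width and height are in pixels.
    """
    banks = set()
    for ty in range(y // TILE_H, (y + height - 1) // TILE_H + 1):
        for tx in range(x // TILE_W, (x + width - 1) // TILE_W + 1):
            banks.add(tile_bank(tx, ty))
    return banks
```

With this assignment, an unaligned 8 × 8 fetch such as `banks_touched(5, 6, 8, 8)` spreads over all four banks, whereas an aligned fetch stays within one.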
Pub Date: 2007-11-21; DOI: 10.1109/SIPS.2007.4387535
Sun-Ah Hong, Yong-Eun Kim, Jin-Gyun Chung, Sung-Chul Lee
The partial product matrix (PPM) of a squarer is symmetric. To reduce the depth of the PPM, it can be folded, shifted, and rearranged. In this paper, we propose a squarer design method based on partial product grouping. The proposed squarers achieve reductions of up to 24.7%, 24.4%, and 6.7% in area, power consumption, and propagation delay, respectively, compared with conventional squarers.
{"title":"Efficient Squarer Design Using Group Partial Products","authors":"Sun-Ah Hong, Yong-Eun Kim, Jin-Gyun Chung, Sung-Chul Lee","doi":"10.1109/SIPS.2007.4387535","DOIUrl":"https://doi.org/10.1109/SIPS.2007.4387535","url":null,"abstract":"The partial product matrix (PPM) of a squarer is symmetric. To reduce the depth of PPM, it can be folded, shifted and rearranged. In this paper, we propose a squarer design method using partial product grouping method. The proposed squarers lead to up to 24.7%, 24.4% and 6.7% reduction in area, power consumption and propagation delay compared with conventional squarers.","PeriodicalId":93225,"journal":{"name":"Proceedings. IEEE Workshop on Signal Processing Systems (2007-2014)","volume":"18 1","pages":"146-150"},"PeriodicalIF":0.0,"publicationDate":"2007-11-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78338259","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
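The symmetry of the squarer's partial product matrix can be checked with a short Python sketch. It shows only the standard folding step (each off-diagonal pair x_i·x_j and x_j·x_i is merged into one term shifted left by one bit, and each diagonal term x_i·x_i reduces to x_i); the grouping method proposed in the paper is not reproduced. The function names are assumptions.

```python
def folded_square(x, n):
    """Square an n-bit unsigned integer using the folded partial product matrix."""
    bits = [(x >> i) & 1 for i in range(n)]
    total = 0
    for i in range(n):
        # Diagonal term: x_i AND x_i equals x_i, with weight 2^(2i).
        total += bits[i] << (2 * i)
        for j in range(i + 1, n):
            # Symmetric pair x_i*x_j + x_j*x_i folded into one term of weight 2^(i+j+1).
            total += (bits[i] & bits[j]) << (i + j + 1)
    return total


def folded_term_count(n):
    """Number of partial product terms after folding (versus n*n before)."""
    return n * (n + 1) // 2
```

For an 8-bit operand, folding leaves 36 terms instead of 64, which is the starting point for further shifting and rearranging.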
Pub Date: 2007-11-21; DOI: 10.1109/SIPS.2007.4387525
Ying-Ren Chien, Yen-Ting Tu, H. Tsao, W. Mao
Unlike in a 1000BASE-T system, far-end crosstalk (FEXT) must be suppressed by at least 20 dB to meet the high-speed transmission requirement of 10GBASE-T. Without FEXT cancellation, the average decision-point signal-to-noise ratio (DP-SNR) can degrade by 3 dB. This paper presents a multi-input multi-output Tomlinson-Harashima precoding (MIMO THP) technique that equalizes the channel and cancels the FEXT interference. A corresponding training method that deals with delay skew among channels and an arrangement of different step sizes in the least mean square (LMS) adaptive algorithm are also proposed. Simulation results show that delay skew compensation and the step-size arrangement improve DP-SNR by 4.59 dB and 1.62 dB, respectively. The proposed MIMO THP architecture improves DP-SNR by 2.75 dB over the tentative-decision-based approach.
{"title":"Equalization and Interference Cancellation with MIMO THP for 10GBASE-T","authors":"Ying-Ren Chien, Yen-Ting Tu, H. Tsao, W. Mao","doi":"10.1109/SIPS.2007.4387525","DOIUrl":"https://doi.org/10.1109/SIPS.2007.4387525","url":null,"abstract":"Unlike 1000BASE-T system, the far-end crosstalk (FEXT) must be suppressed by at least 20 dB to meet the high speed transmission requirement for 10GBASE-T. Without FEXT cancellation, the average decision-point signal-to-noise ratio (DP-SNR) can degrade by 3 dB. This paper presents a multi-input multi-output Tomlinson-Harashima precoding (MIMO THP) technique to equalize the channel and to cancel the FEXT interference. Besides, the corresponding training method to deal with delay skew among channels and the arrangement of different step-size in least mean square (LMS) adaptive algorithm are proposed as well. Simulation results show that delay skew compensation and step-sizes arrangement can improve DP-SNR by 4.59 dB and 1.62 dB, respectively. The proposed MIMO THP architecture improves the DP-SNR by 2.75 dB than The tenative decision based approach.","PeriodicalId":93225,"journal":{"name":"Proceedings. IEEE Workshop on Signal Processing Systems (2007-2014)","volume":"53 1","pages":"95-100"},"PeriodicalIF":0.0,"publicationDate":"2007-11-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74761737","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
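The principle behind Tomlinson-Harashima precoding can be sketched in Python for a single, noise-free channel. This is a scalar illustration only, not the MIMO structure with FEXT cancellation proposed in the paper: the PAM-16 alphabet, the channel taps, and all function names are assumptions. The transmitter subtracts the known post-cursor interference and folds the result back into a bounded range; the receiver applies the same modulo to recover the symbol.

```python
M = 16  # number of PAM levels; symbols are odd integers in [-15, 15] (assumption)


def modulo(v, m=M):
    """Fold v into the interval [-m, m)."""
    return (v + m) % (2 * m) - m


def thp_precode(symbols, taps):
    """Precode symbols for a channel 1 + taps[0]*z^-1 + taps[1]*z^-2 + ..."""
    out = []
    for a in symbols:
        # Interference from already transmitted samples.
        isi = sum(h * out[-k] for k, h in enumerate(taps, start=1) if k <= len(out))
        out.append(modulo(a - isi))
    return out


def channel(x, taps):
    """Noise-free channel with a unit main tap followed by post-cursor taps."""
    y = []
    for n in range(len(x)):
        isi = sum(h * x[n - k] for k, h in enumerate(taps, start=1) if k <= n)
        y.append(x[n] + isi)
    return y


def receive(y):
    """Modulo reduction followed by rounding to the nearest integer level."""
    return [round(modulo(v)) for v in y]
```

The modulo keeps the transmitted samples within [-16, 16) regardless of the channel taps, which is the reason THP avoids the power growth of a plain pre-equalizer.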
Pub Date: 2007-11-21; DOI: 10.1109/SIPS.2007.4387533
Y. Hwang, Jin-Fa Lin, M. Sheu, Chia-Jen Sheu
In this paper, we propose two novel low-power multipliers based on enhanced row bypassing schemes. The essence of the power-saving idea is to eliminate unnecessary computation via signal bypassing. In an array multiplier, futile computations occur in those columns or rows of adders that correspond to zero bits in the input operands. Previous designs resort to input gating and output multiplexing to accomplish signal bypassing. The proposed designs, however, resolve the adverse DC power consumption problem caused by voltage loss in gated signals and implement the multiplexing mechanism via clocked CMOS (C2MOS) circuitry. Two versions of the design are proposed, one emphasizing maximum power saving and the other reduced circuit complexity. The circuit overheads of the two designs are confined to 23.4% and 12.8%, respectively. The proposed designs also achieve better and more consistent power saving than previous work over a wide range of Vdd, with power savings as high as 17%.
{"title":"Low Power Multipliers Using Enhenced Row Bypassing Schemes","authors":"Y. Hwang, Jin-Fa Lin, M. Sheu, Chia-Jen Sheu","doi":"10.1109/SIPS.2007.4387533","DOIUrl":"https://doi.org/10.1109/SIPS.2007.4387533","url":null,"abstract":"In this paper, we proposed two novel low power multipliers based on enhanced row bypassing schemes. The essence of the power saving idea is eliminating unnecessary computation via signal bypassing. In an array multiplier, futile computations occur on those columns or rows of adder corresponding to zero bits in the input operands. Previous designs resort to input gating and output multiplexing to accomplish signal bypassing. The proposed designs, however, successfully resolve the adverse DC power consumption problem due to voltage loss in gated signals and implement the multiplexing mechanism cleverly via clock CMOS (C2MOS) circuitry. Two versions of the design are proposed with one emphasizing on maximizing power saving and the other focusing on reduced circuit complexity. The circuit overheads of both designs are confined to 23.4% and 12.8%, respectively. The proposed designs also achieve better and consistent power saving than previous work under a wide range of Vdd and the power saving can be as high as 17%.","PeriodicalId":93225,"journal":{"name":"Proceedings. IEEE Workshop on Signal Processing Systems (2007-2014)","volume":"14 1","pages":"136-141"},"PeriodicalIF":0.0,"publicationDate":"2007-11-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"72830488","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
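The row bypassing idea can be modeled behaviorally in Python. The sketch below is a software analogy, not the C2MOS circuit from the paper: each multiplier bit corresponds to one row of adders in an array multiplier, and rows whose multiplier bit is zero are skipped because they would add nothing. The function name and the returned row count are assumptions for illustration.

```python
def row_bypass_multiply(a, b, n):
    """Multiply a by the n-bit multiplier b, skipping rows where b's bit is 0.

    Returns (product, active_rows), where active_rows counts the adder rows
    that actually performed work.
    """
    acc = 0
    active_rows = 0
    for j in range(n):
        if (b >> j) & 1 == 0:
            continue  # bypass: this row would only add zero
        acc += a << j
        active_rows += 1
    return acc, active_rows
```

For example, with multiplier `0b1001` only two of the four rows are active, so half of the adder rows could stay idle in a hardware realization.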
Pub Date: 2007-11-21; DOI: 10.1109/SIPS.2007.4387618
Guangtao Zhai, Wenjun Zhang, Yi Xu, Weisi Lin
The phase map of an image captures the most fundamental cognitive features and is thus widely used in various digital image processing tasks. In this paper, we propose Log Gabor Phase Similarity (LGPS), a novel full-reference image quality assessment metric based on measuring similarities between phases in the log Gabor transform domain. Phase captures changes in image details regardless of fluctuations in contrast, and the similarity between phase maps provides a measure of the perceptual quality of images. An image is first decomposed by a filter bank consisting of a pair of log Gabor filters. The phase maps are then computed from the responses of each filter pair. We have developed a window-based similarity metric that evaluates the resemblance between phase maps so as to measure the quality of the image. Experimental results and comparative studies suggest that LGPS can be used to predict the perceived quality of images with different distortions.
{"title":"LGPS: Phase Based Image Quality Assessment Metric","authors":"Guangtao Zhai, Wenjun Zhang, Yi Xu, Weisi Lin","doi":"10.1109/SIPS.2007.4387618","DOIUrl":"https://doi.org/10.1109/SIPS.2007.4387618","url":null,"abstract":"Phase map of the images captures the most fundamental cognitive features and thus is widely used in various digital image processing tasks. In this paper, we propose the Log Gabor Phase Similarity (LGPS), a novel full reference image quality assessment metrics based on measuring of similarities between phases in log Gabor transform domain. Phase can capture any changes in image details regardless of the fluctuation in contrast, and the similarity between phase maps provides a measure of the perceptual quality of images. An image is firstly decomposed by a filter bank consisting of a pair of log Gabor filters. The phase maps are then computed from the responses of each filter pair. We have developed a window-based similarity metric to evaluate the resemblance between phase maps so as to measure the quality of the image. Experimental results and comparative studies suggest that LGPS can be used to predict the perceived quality of images with different distortions.","PeriodicalId":93225,"journal":{"name":"Proceedings. IEEE Workshop on Signal Processing Systems (2007-2014)","volume":"21 1","pages":"605-609"},"PeriodicalIF":0.0,"publicationDate":"2007-11-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1109/SIPS.2007.4387618","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"72500295","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2007-11-21; DOI: 10.1109/SIPS.2007.4387524
Youhong Lu, Guodong Shi, Jiansheng Zhang
Surrounding noise degrades both audio and video quality in a multimedia environment. This paper applies the Gabor expansion to passive noise reduction of speech signals. The noise reduction method and its performance via the Gabor expansion are studied. The Gabor expansion with a Gaussian prototype function has the property that its time-frequency product is minimal. This property makes noise-only segments clearer and the noise estimate more robust.
{"title":"Audio Enhancement via Noise Reduction through Gabor Expansion","authors":"Youhong Lu, Guodong Shi, Jiansheng Zhang","doi":"10.1109/SIPS.2007.4387524","DOIUrl":"https://doi.org/10.1109/SIPS.2007.4387524","url":null,"abstract":"Surrounded noises degrade both audio and video qualities in a multimedia environment. This paper applies Gabor expansion to passive noise reduction of speech signals. The noise reduction method and performance via Gabor expansion are studied. The Gabor expansion with Gaussian prototype function has a property that its time and frequency product is minimal. The property makes the noise-only segment more clearly and the noise estimate more robustly.","PeriodicalId":93225,"journal":{"name":"Proceedings. IEEE Workshop on Signal Processing Systems (2007-2014)","volume":"7 1","pages":"90-94"},"PeriodicalIF":0.0,"publicationDate":"2007-11-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85450710","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2007-11-21; DOI: 10.1109/SIPS.2007.4387591
J. Boutellier, S. Bhattacharyya, O. Silvén
In this paper, we present four scheduling algorithms that provide flexible utilization of fine-grain DSP accelerators with low run-time overhead. Methods that have originally been used in operations research are implemented in a way that minimizes the amount of run-time computations. These low overhead scheduling methods can be used for synchronization in multi-processor systems, especially when dedicated co-processors implement tasks with low turnaround times. We demonstrate our methods by an application to MPEG-4 video decoding. In this demonstration, MPEG-4 macroblock decoding is modeled as a permutation flowshop problem and our proposed algorithms are applied to schedule co-processors that implement MPEG-4 block decoding operations. Experimental results demonstrate the effectiveness of our scheduling approach.
{"title":"Low-Overhead Run-Time Scheduling for Fine-Grained Acceleration of Signal Processing Systems","authors":"J. Boutellier, S. Bhattacharyya, O. Silvén","doi":"10.1109/SIPS.2007.4387591","DOIUrl":"https://doi.org/10.1109/SIPS.2007.4387591","url":null,"abstract":"In this paper, we present four scheduling algorithms that provide flexible utilization of fine-grain DSP accelerators with low run-time overhead. Methods that have originally been used in operations research are implemented in a way that minimizes the amount of run-time computations. These low overhead scheduling methods can be used for synchronization in multi-processor systems, especially when dedicated co-processors implement tasks with low turnaround times. We demonstrate our methods by an application to MPEG-4 video decoding. In this demonstration, MPEG-4 macroblock decoding is modeled as a permutation flowshop problem and our proposed algorithms are applied to schedule co-processors that implement MPEG-4 block decoding operations. Experimental results demonstrate the effectiveness of our scheduling approach.","PeriodicalId":93225,"journal":{"name":"Proceedings. IEEE Workshop on Signal Processing Systems (2007-2014)","volume":"10 1","pages":"457-462"},"PeriodicalIF":0.0,"publicationDate":"2007-11-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85289082","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
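The permutation flowshop model mentioned in the abstract above can be made concrete with a short Python sketch. It computes the makespan of a given job order when every job visits the machines (here standing in for co-processors) in the same sequence, and finds the best order by exhaustive search. The exhaustive search is only there to define the problem; it is not one of the paper's four low-overhead algorithms, and the processing times are made-up values.

```python
from itertools import permutations


def makespan(order, proc_times):
    """Completion time of the last job on the last machine.

    proc_times[job][machine] is the processing time of a job on a machine;
    all jobs visit machines 0, 1, 2, ... in that order.
    """
    num_machines = len(proc_times[0])
    finish = [0] * num_machines  # finish[m]: time machine m becomes free
    for job in order:
        for m in range(num_machines):
            # A job starts on machine m once the machine is free and the job
            # has left machine m - 1.
            ready = finish[m - 1] if m > 0 else 0
            finish[m] = max(finish[m], ready) + proc_times[job][m]
    return finish[-1]


def best_order(proc_times):
    """Exhaustively search all job permutations (feasible for tiny inputs only)."""
    jobs = range(len(proc_times))
    return min(permutations(jobs), key=lambda order: makespan(order, proc_times))


# Hypothetical processing times: three jobs on two machines.
times = [[3, 2], [1, 4], [2, 1]]
```

The number of permutations grows factorially with the number of jobs, which is why run-time scheduling needs methods with far lower overhead than this search.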