Pub Date : 1900-01-01 DOI: 10.1109/SIPS.2005.1579838
F. Sheikh, S. Masud, A. Loan
An improved intermediate frequency (IF) architecture for software defined radios is presented. This architecture is programmable, reconfigurable and suited to hardware implementation. The architecture is based on a computationally efficient method of extracting multiple channels belonging to two different communication standards, GSM and IS-95. The core of the system comprises polyphase DFT filterbanks and very economical fractional rate-change filters. A flexible and efficient sample rate conversion method is also proposed that performs common rate changes using a shared hardware structure. Computational and hardware complexity comparisons are made based on results from a simulation test-bed developed for the proposed system.
Title : Programmable IF architecture for multi-standard software defined radios
Venue : IEEE Workshop on Signal Processing Systems Design and Implementation, 2005.
Pub Date : 1900-01-01 DOI: 10.1109/SIPS.2005.1579850
G. Glikiotis, Vassilis Paliouras
This paper introduces a novel criterion for terminating iterations in iterative LDPC code decoders. The proposed criterion is amenable to VLSI implementation, and it is shown that it can substantially enhance previously reported LDPC code decoder architectures by reducing their power dissipation. The criterion is based on detecting cycles in the sequences of soft words. Soft-word cycles occur in some cases of low signal-to-noise ratio and indicate that the decoder is unable to decide on a codeword, which in turn results in unnecessary power consumption due to iterations that do not improve the bit error rate. The proposed architecture terminates the decoding process when a soft-word cycle occurs, allowing for substantial power savings at a minimal performance penalty. The proposed criterion is applied to hardware-sharing and parallel decoder architectures.
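The termination idea can be illustrated with a small software sketch. It is not the authors' VLSI architecture: the function name `decode_with_cycle_stop`, the callbacks `step` and `is_codeword`, and the choice to remember every earlier soft word in a set are assumptions made purely for illustration (a hardware design would more plausibly compare against a short window or a compact signature).

```python
from typing import Callable, Sequence, Tuple

SoftWord = Tuple[int, ...]


def decode_with_cycle_stop(
    step: Callable[[SoftWord], SoftWord],
    is_codeword: Callable[[SoftWord], bool],
    initial: Sequence[int],
    max_iters: int = 50,
) -> Tuple[SoftWord, int, str]:
    """Iterate a decoder until it converges, revisits a soft word, or hits max_iters.

    `step` stands in for one decoding iteration and `is_codeword` for the
    parity check; both are hypothetical placeholders supplied by the caller.
    """
    word = tuple(initial)
    seen = {word}  # every soft word produced so far (illustrative, not hardware-friendly)
    for iteration in range(1, max_iters + 1):
        word = step(word)
        if is_codeword(word):
            return word, iteration, "converged"
        if word in seen:
            # The sequence of soft words has entered a cycle, so further
            # iterations would only repeat earlier states.
            return word, iteration, "cycle"
        seen.add(word)
    return word, max_iters, "max_iters"


if __name__ == "__main__":
    # Toy "decoder" that flips every sign, so it oscillates with period two.
    flip = lambda w: tuple(-x for x in w)
    print(decode_with_cycle_stop(flip, lambda w: False, [1, -2, 3]))
```

With the toy oscillating step the loop stops after two iterations instead of running to `max_iters`, which is the source of the power saving the abstract describes.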
Title : A low-power termination criterion for iterative LDPC code decoders
Pub Date : 1900-01-01 DOI: 10.1109/SIPS.2005.1579935
D. Ko, S. Bhattacharyya
As modern image and video processing applications handle increasingly high image resolutions, the buffering requirements between communicating functional modules increase correspondingly. The performance and cost of these applications can change dramatically depending on how FIFO buffers are implemented and how data is delivered between modules. This paper introduces a new FIFO hardware mapping algorithm for image and video processing applications, based on pointer-based token delivery in dataflow semantics. This approach significantly improves the performance of dataflow-based implementations of image and video processing systems, and allows effective prediction of the changes in performance and buffer memory requirements associated with changes in image resolution. Our pointer-based token delivery method allows indirect token delivery between actors by means of pointers used in conjunction with a shared memory. Each pointer references a data block stored in the shared memory. In pointer-based token delivery, a buffer can be implemented as the combination of a small, fast FIFO and a larger, relatively cheap shared memory, providing an attractive trade-off between performance and hardware cost. We present the complete semantics of our pointer-based modeling method, systematic techniques for mapping representations using these semantics into efficient implementations, and experimental results that demonstrate the performance of the proposed pointer-based techniques.
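A minimal software model of pointer-based token delivery is sketched below. It is an illustration under assumptions, not the paper's mapping algorithm: the class name `PointerFifo`, its `push`/`pop` methods, and the free-list allocation policy are all hypothetical.

```python
from collections import deque


class PointerFifo:
    """Small FIFO of pointers in front of a larger shared block memory (toy model)."""

    def __init__(self, num_blocks: int, fifo_depth: int) -> None:
        self.shared = [None] * num_blocks     # large, relatively cheap shared memory
        self.free = deque(range(num_blocks))  # indices of unused blocks
        self.fifo = deque()                   # small, fast FIFO holding only pointers
        self.fifo_depth = fifo_depth

    def push(self, block) -> bool:
        """Producer side: store the block, enqueue its pointer. False means back-pressure."""
        if not self.free or len(self.fifo) >= self.fifo_depth:
            return False
        ptr = self.free.popleft()
        self.shared[ptr] = block
        self.fifo.append(ptr)
        return True

    def pop(self):
        """Consumer side: dequeue a pointer, fetch and release its block. None if empty."""
        if not self.fifo:
            return None
        ptr = self.fifo.popleft()
        block = self.shared[ptr]
        self.shared[ptr] = None
        self.free.append(ptr)
        return block


if __name__ == "__main__":
    buf = PointerFifo(num_blocks=8, fifo_depth=2)
    print(buf.push("line 0"), buf.push("line 1"), buf.push("line 2"))
    print(buf.pop(), buf.pop(), buf.pop())
```

The FIFO only ever carries small indices, so its width is independent of the block size; growing the image resolution mainly grows the shared memory, which is the trade-off the abstract points to.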
Title : Modeling and optimization of buffering trade-offs for hardware implementation of image processing applications
Pub Date : 1900-01-01 DOI: 10.1109/SIPS.2005.1579839
T. Preußer, M. Zabel, R. Spallek
This paper explores the analogy between carry propagation within binary adders and token passing within arbiter implementations. This analysis identifies a common design space, thus decreasing design cost and time through efficient re-use beyond individual application domains. The immediate utilization of available carry-propagation networks is outlined and justified. This, for instance, enables designers to choose directly from a large pool of well-studied parallel prefix networks. While these solutions are, due to their regularity, favorable for VLSI ASIC designs, they usually do not synthesize well on FPGAs. Extending the analogy between carry propagation and token passing to this domain, the appropriate utilization of the carry chains commonly available on FPGAs is demonstrated to yield small and fast arbiters.
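The carry/token analogy can be made concrete with two well-known bit tricks in which the arbitration is carried out by integer arithmetic, so the "token" travels along the adder's carry (borrow) chain. The sketch below is a generic illustration rather than the circuits from the paper; the function names and the one-hot `base` pointer convention are assumptions.

```python
def fixed_priority_grant(req: int) -> int:
    """Grant the lowest-numbered requester.

    -req is (~req + 1) in two's complement: the +1 carry ripples through the
    low zero bits and stops at the first request, isolating exactly that bit.
    """
    return req & -req


def round_robin_grant(req: int, base: int, n: int) -> int:
    """Grant the first requester at or above the one-hot position `base`, wrapping around.

    The n-bit request vector is duplicated to 2n bits; subtracting `base`
    sends a borrow upward until it hits the first set request bit.
    """
    mask = (1 << n) - 1
    req &= mask
    double = req | (req << n)            # unrolled copy handles the wrap-around
    grant = double & ~(double - base)    # bit where the borrow chain stopped
    return (grant | (grant >> n)) & mask  # fold the upper copy back down


if __name__ == "__main__":
    print(bin(fixed_priority_grant(0b0110)))
    print(bin(round_robin_grant(0b0101, base=0b0010, n=4)))
    print(bin(round_robin_grant(0b0001, base=0b0100, n=4)))
```

On an FPGA the subtraction would typically map onto the dedicated carry chain, which is the kind of re-use the abstract argues for.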
Title : About carries and tokens: re-using adder circuits for arbitration
Pub Date : 1900-01-01 DOI: 10.1109/SIPS.2005.1579867
Haung-Chun Tseng, Cheng-Ru Chang, Y. Lin
We propose a hardware accelerator for H.264/AVC motion compensation. Our design supports all advanced features, including variable-block-size motion estimation from multiple reference frames for both P and B slices, quarter-pixel accuracy, and weighted bi-directional prediction. We pay special attention to the design of the memory subsystem in order to optimize both memory usage and memory bandwidth. We have integrated the accelerator into an H.264/AVC main profile decoder in an FPGA prototype. Compared with previous work, our accelerator is smaller and faster.
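For readers unfamiliar with quarter-pixel accuracy, the following one-dimensional sketch shows the luma sub-sample interpolation defined by H.264/AVC: half-sample positions use the 6-tap filter (1, -5, 20, 20, -5, 1) with rounding and clipping, and quarter-sample positions average two neighbouring samples with upward rounding. The function names are illustrative, and the sketch says nothing about how the accelerator itself organises these computations.

```python
def clip255(x: int) -> int:
    """Clamp to the 8-bit sample range."""
    return max(0, min(255, x))


def half_pel(row, i: int) -> int:
    """Half-sample value between row[i] and row[i + 1].

    Requires 2 <= i <= len(row) - 4 so that all six taps fall inside the row.
    """
    taps = (1, -5, 20, 20, -5, 1)
    acc = sum(t * row[i - 2 + k] for k, t in enumerate(taps))
    return clip255((acc + 16) >> 5)  # divide by 32 with rounding, then clip


def quarter_pel(row, i: int) -> int:
    """Quarter-sample value between row[i] and the half-sample to its right."""
    return (row[i] + half_pel(row, i) + 1) >> 1


if __name__ == "__main__":
    edge = [0, 0, 0, 255, 255, 255]
    print(half_pel(edge, 2), quarter_pel(edge, 2))
```

Each half-sample needs six integer samples, which is why reference-block fetches are larger than the block being predicted and why memory bandwidth becomes a central design concern.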
Title : A hardware accelerator for H.264/AVC motion compensation
Pub Date : 1900-01-01 DOI: 10.1109/SIPS.2005.1579883
Hyuchang Im, Wonchul Lee, Wonyong Sung
We studied the efficient implementation of a motion estimation algorithm for H.264/AVC on the TMS320C64x, a VLIW (very long instruction word) SIMD (single instruction multiple data) digital signal processor. H.264 motion estimation algorithms demand many arithmetic operations, especially because of the variable block size optimization. The SAD (sum of absolute differences) reuse method is chosen not only to reduce the computation but also to exploit its regular algorithmic structure, which is essential for efficient implementation on parallel and pipelined processors. We applied several techniques: increased loop length for efficient software pipelining, multi-block SAD computation to reduce memory access overhead, block processing to minimize cache misses, and improved quarter-pixel processing. The implementation results show that real-time motion estimation (ME) for D1-size (720×480) video is possible using a 720 MHz TMS320C6416 digital signal processor.
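SAD reuse can be sketched in a few lines: the sixteen 4x4 SADs of a 16x16 macroblock are computed once for a given candidate displacement, and the SADs of the larger partitions are then obtained by addition only. The code below is a plain-Python illustration with hypothetical function names; it does not reflect the DSP-specific optimisations (software pipelining, packed SIMD instructions) described in the abstract.

```python
def sad4x4_grid(cur, ref):
    """Return a 4x4 grid holding the SAD of each 4x4 sub-block of a 16x16 block."""
    grid = [[0] * 4 for _ in range(4)]
    for y in range(16):
        for x in range(16):
            grid[y // 4][x // 4] += abs(cur[y][x] - ref[y][x])
    return grid


def reuse_sads(grid):
    """Derive larger-partition SADs from the 4x4 SADs by summation only."""
    def region(r0, r1, c0, c1):
        return sum(grid[r][c] for r in range(r0, r1) for c in range(c0, c1))

    return {
        "8x8": [region(r, r + 2, c, c + 2) for r in (0, 2) for c in (0, 2)],
        "16x8": [region(0, 2, 0, 4), region(2, 4, 0, 4)],   # top and bottom halves
        "8x16": [region(0, 4, 0, 2), region(0, 4, 2, 4)],   # left and right halves
        "16x16": region(0, 4, 0, 4),
    }


if __name__ == "__main__":
    cur = [[1] * 16 for _ in range(16)]
    ref = [[0] * 16 for _ in range(16)]
    print(reuse_sads(sad4x4_grid(cur, ref)))
```

The pixel-level absolute differences are evaluated once per candidate position, and every partition size shares them, which keeps the inner loop regular and well suited to a pipelined processor.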
Title : Implementation of an H.264 motion estimation algorithm on a VLIW programmable digital signal processor
Pub Date : 1900-01-01 DOI: 10.1109/SIPS.2005.1579865
Q. Ho, D. Massicotte
The hardware implementation issues of multiuser interference cancellation techniques for multirate asynchronous direct-sequence code-division multiple-access (DS-CDMA) systems based on a variable spreading factor (VSF) are investigated. Starting from the cascade adaptive filter multi-user detector (CF-MUD) algorithm for monorate systems, an analysis is carried out to choose the best trade-offs between hardware implementation and algorithmic performance in third-generation (3G) communication scenarios. We investigate two popular techniques, namely the low-rate detector (LRD) and the high-rate detector (HRD). The goal is to extend the CF-MUD algorithm to multirate systems and reuse the FPGA-targeted architectures that we previously developed for it. The developed architectures can be used as an intellectual property (IP) core in a system on a programmable chip (SOPC) based on Xilinx Virtex II Pro and Virtex II devices, providing the MUD function for asynchronous multirate systems.
Title : Hardware implementation issues of cascade filters MUD for multirate WCDMA systems
Pub Date : 1900-01-01 DOI: 10.1109/SIPS.2005.1579927
T. Clerckx, A. Munteanu, J. Cornelis, P. Schelkens
Scalable wavelet-based video coding is of paramount importance in applications that must adapt the video content to the channel conditions and allow a variety of user preferences in terms of quality, resolution and frame rate. In addition, the growing number of mobile devices on which video has to be decoded stresses the need for complexity scalability to cope with their limited computational capabilities. In this context, the paper focuses on wavelet-based scalable video coding based on in-band motion-compensated temporal filtering and addresses the problem of achieving complexity scalability in the complete-to-overcomplete discrete wavelet transform (CODWT) module employed by such architectures. The proposed methods demonstrate that this can be achieved at the cost of a limited and controllable penalty in overall coding performance.
Title : Complexity scalability in video coding based on in-band motion-compensated temporal filtering
Pub Date : 1900-01-01 DOI: 10.1109/SIPS.2005.1579945
A. Doulamis, N. Doulamis
Efficient video content adaptation requires techniques for content analysis and appropriate mechanisms for scaling content according to network properties, terminal device characteristics and users' preferences. In this paper, we propose an adaptive optimal rate-distortion scheme that can assign a different priority to each video object according to the users' needs, network platform capabilities and terminal characteristics, without violating the target bit rate of the sequence. We assume that the video objects have already been detected by a content segmentation algorithm. The proposed scheme minimizes the effect of objects of non-interest relative to objects of interest.
Title : Optimal object-based scalability for video content adaptation according to the usage environment
Pub Date : 1900-01-01 DOI: 10.1109/SIPS.2005.1579915
D. Novo, W. Moffat, V. Derudder, B. Bougard
The increasing demand for multimodal wireless communication is driving designers towards software defined radio (SDR). New high-performance reconfigurable platforms for baseband digital signal processing are therefore required. Thanks to their flexibility, low reconfiguration overhead, performance and energy efficiency, coarse-grain reconfigurable arrays (CGRAs) are good candidates to fulfil this need. ADRES is a CGRA that combines a VLIW processor with a reconfigurable coarse-grain array. In this paper, we analyze the mapping onto ADRES of one of the most demanding wireless OFDM DSP algorithms: the space division multiplexing (SDM) receiver. SDM will probably be mandatory in the next WLAN generation (802.11n). We also compare the results with a mapping onto a VLIW processor, showing a gain of a factor of 5 in performance and a factor of 1.75 in power efficiency.
Title : Mapping a multiple antenna SDM-OFDM receiver on the ADRES coarse-grained reconfigurable processor