Pub Date : 2001-09-26 DOI: 10.1109/SIPS.2001.957325
Ching-wen Wang, Yun-Nan Chang
A novel design of a Viterbi (1965) decoder based on in-place state metric update and hybrid survivor path management is presented. For Viterbi applications with a large constraint length, the proposed design methodology yields high-speed, modular architectures by exploiting the in-place computation feature of the Viterbi algorithm. This feature is applied not only to the design of highly regular add-compare-select (ACS) units but also, for the first time, to the design of trace-back units. The proposed hybrid survivor path management, which combines register-exchange and trace-back schemes, reduces both the number of memory operations and the size of the memory required. Compared with the general hybrid trace-back structure, the overhead of the register-exchange circuit in our architecture is significantly lower. The proposed architecture is therefore promising for digital communication systems that need high-speed, large-state Viterbi decoders.
{"title":"Design of Viterbi decoders with in-place state metric update and hybrid traceback processing","authors":"Ching-wen Wang, Yun-Nan Chang","doi":"10.1109/SIPS.2001.957325","DOIUrl":"https://doi.org/10.1109/SIPS.2001.957325","PeriodicalId":246898,"journal":{"name":"2001 IEEE Workshop on Signal Processing Systems. SiPS 2001. Design and Implementation (Cat. No.01TH8578)"},"publicationDate":"2001-09-26","platform":"Semanticscholar","paperid":"115479684"}
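The add-compare-select (ACS) and trace-back steps named in the abstract above can be illustrated with a minimal software sketch. It is not the authors' architecture: it uses a plain double-buffered metric update and pure trace-back rather than the in-place update or the hybrid register-exchange scheme, and the rate-1/2, constraint-length-3 code with generators (7, 5) is an assumed example chosen only for brevity.

```python
G = (0b111, 0b101)        # assumed example generators (7, 5) octal
K = 3                     # assumed constraint length
NSTATES = 1 << (K - 1)
MASK = NSTATES - 1

def parity(v):
    return bin(v).count("1") & 1

def branch_output(bit, state):
    """Encoder output pair for input `bit` when leaving `state`."""
    reg = (bit << (K - 1)) | state
    return tuple(parity(reg & g) for g in G)

def encode(bits):
    state, out = 0, []
    for b in bits:
        out.append(branch_output(b, state))
        state = ((b << (K - 1)) | state) >> 1
    return out

def viterbi_decode(symbols):
    """Hard-decision Viterbi decoding with full trace-back."""
    inf = float("inf")
    metrics = [0] + [inf] * (NSTATES - 1)   # encoder starts in state 0
    decisions = []                          # one survivor decision per state per step
    for r in symbols:
        new_metrics = [inf] * NSTATES
        row = [0] * NSTATES
        for nxt in range(NSTATES):
            bit = nxt >> (K - 2)            # input bit that leads into nxt
            for low in (0, 1):              # the bit shifted out of the predecessor
                prev = ((nxt << 1) & MASK) | low
                expected = branch_output(bit, prev)
                branch = sum(e != x for e, x in zip(expected, r))
                cand = metrics[prev] + branch       # add
                if cand < new_metrics[nxt]:         # compare
                    new_metrics[nxt] = cand         # select
                    row[nxt] = low
        metrics = new_metrics
        decisions.append(row)
    # Trace back from the best final state through the stored decisions.
    state = min(range(NSTATES), key=lambda s: metrics[s])
    bits = []
    for row in reversed(decisions):
        bits.append(state >> (K - 2))
        state = ((state << 1) & MASK) | row[state]
    return bits[::-1]
```

In this sketch the `decisions` table plays the role of the survivor memory; a register-exchange scheme would instead carry the decoded bits along with each state, trading memory accesses for wiring.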
Pub Date : 2001-09-26 DOI: 10.1109/SIPS.2001.957334
L. Imbert, G. Jullien
We investigate efficiencies that may be introduced into the fault-tolerant modulus replication RNS (MRRNS) system by restricting the data sample polynomials to be even. We refer to this as the symmetrical MRRNS (SMRRNS) technique.
{"title":"Efficient fault-tolerant arithmetic using a symmetrical modulus replication RNS","authors":"L. Imbert, G. Jullien","doi":"10.1109/SIPS.2001.957334","DOIUrl":"https://doi.org/10.1109/SIPS.2001.957334","PeriodicalId":246898,"journal":{"name":"2001 IEEE Workshop on Signal Processing Systems. SiPS 2001. Design and Implementation (Cat. No.01TH8578)"},"publicationDate":"2001-09-26","platform":"Semanticscholar","paperid":"116527361"}
Pub Date : 2001-09-26 DOI: 10.1109/SIPS.2001.957353
Chuxiang Li, Jianhua Lu, Jun Gu, Ming L. Liou
An error-resilient scheme integrated with the MPEG-2 standard is developed to support robust video transmission in digital terrestrial TV broadcasting (DTTB) systems. In particular, a novel concealment algorithm based on temporal error concealment and block matching is proposed. The algorithm conceals errors effectively while keeping computational complexity low by using a small block-matching search window. Likewise, an effective reception scheme for isolated I-pictures is developed. Moreover, by combining these with an efficient detector of spatial/temporal activity, an adaptive error concealment scheme is devised. Extensive simulations confirm that the proposed error-resilient schemes can achieve efficient and robust video transmission in a DTTB system even at a very high packet error rate.
{"title":"Error resilience schemes for digital terrestrial TV broadcasting system","authors":"Chuxiang Li, Jianhua Lu, Jun Gu, Ming L. Liou","doi":"10.1109/SIPS.2001.957353","DOIUrl":"https://doi.org/10.1109/SIPS.2001.957353","PeriodicalId":246898,"journal":{"name":"2001 IEEE Workshop on Signal Processing Systems. SiPS 2001. Design and Implementation (Cat. No.01TH8578)"},"publicationDate":"2001-09-26","platform":"Semanticscholar","paperid":"117088242"}
Pub Date : 2001-09-26 DOI: 10.1109/SIPS.2001.957367
R. Osorio, B. Vanhoof
In state-of-the-art multimedia compression standards, arithmetic coding is widely used as a powerful entropy compression method. In the MPEG-4 standard, a specific 4-symbol, multiple-context arithmetic coder is used for wavelet-based image compression. We present an architecture that processes close to 1 symbol per cycle and manages multiple contexts in a simple, cost-efficient manner. A peak performance of 200 Mbit/s is achieved when this architecture is clocked at 100 MHz.
{"title":"200 Mbit/s 4-symbol arithmetic encoder architecture for embedded zero tree-based compression","authors":"R. Osorio, B. Vanhoof","doi":"10.1109/SIPS.2001.957367","DOIUrl":"https://doi.org/10.1109/SIPS.2001.957367","PeriodicalId":246898,"journal":{"name":"2001 IEEE Workshop on Signal Processing Systems. SiPS 2001. Design and Implementation (Cat. No.01TH8578)"},"publicationDate":"2001-09-26","platform":"Semanticscholar","paperid":"129311430"}
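The 200 Mbit/s figure in the abstract above is consistent with a simple back-of-the-envelope calculation, sketched here under the assumption that throughput is counted in input symbol bits (2 bits per symbol of a 4-symbol alphabet). The function name and parameters are hypothetical and not taken from the paper.

```python
import math

def peak_throughput_mbit_s(alphabet_size, symbols_per_cycle, clock_mhz):
    """Peak symbol throughput in Mbit/s, counting log2(alphabet_size) bits per symbol."""
    bits_per_symbol = math.log2(alphabet_size)
    return bits_per_symbol * symbols_per_cycle * clock_mhz

# 4-symbol alphabet, 1 symbol per cycle, 100 MHz clock
print(peak_throughput_mbit_s(4, 1.0, 100.0))  # 200.0
```

Because the abstract states "close to 1 symbol per cycle", the sustained rate would be somewhat below this peak.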
Pub Date : 2001-09-26 DOI: 10.1109/SIPS.2001.957370
N. Roma, L. Sousa
A new class of fully parameterizable multiple array architectures for motion estimation (ME) in video sequences, based on the full search block matching (FSBM) algorithm, is proposed in this paper. This class is based on a new and efficient AB2 single array architecture with minimum latency, maximum throughput and full utilization of the hardware resources. It allows the target processors to be configured according to the setup parameters and the specified limits on processing time and circuit area. To this end, a software configuration tool has been implemented that determines the set of possible configurations meeting the requirements of the video coder and automatically generates the VHDL description of the selected configuration. The implementation of a single array processor configuration on a single chip is presented. Experimental results demonstrate that this configuration can estimate motion vectors in real time.
{"title":"Parameterizable hardware architectures for automatic synthesis of motion estimation processors","authors":"N. Roma, L. Sousa","doi":"10.1109/SIPS.2001.957370","DOIUrl":"https://doi.org/10.1109/SIPS.2001.957370","PeriodicalId":246898,"journal":{"name":"2001 IEEE Workshop on Signal Processing Systems. SiPS 2001. Design and Implementation (Cat. No.01TH8578)"},"publicationDate":"2001-09-26","platform":"Semanticscholar","paperid":"134519408"}
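For readers unfamiliar with full search block matching, the algorithm that the architectures above accelerate can be sketched in software as follows. This is a reference model under stated assumptions, not the paper's array architecture: the sum of absolute differences (SAD) is assumed as the matching criterion, and the function names, frame layout (lists of rows) and the test pattern are illustrative.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the n x n block at (bx, by) in cur
    and the block displaced by (dx, dy) in ref."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total

def full_search(cur, ref, bx, by, n, p):
    """Return (dx, dy, cost) with minimum SAD over all displacements in [-p, p]
    that keep the candidate block inside the reference frame."""
    h, w = len(ref), len(ref[0])
    best = None
    for dy in range(-p, p + 1):
        for dx in range(-p, p + 1):
            inside = (0 <= bx + dx and bx + dx + n <= w and
                      0 <= by + dy and by + dy + n <= h)
            if not inside:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if best is None or cost < best[2]:
                best = (dx, dy, cost)
    return best

# Synthetic 12 x 12 reference frame and a current frame whose content is the
# reference shifted by (2, 1), clamped at the frame border.
W = H = 12
ref = [[(7 * x + 13 * y + x * y) % 251 for x in range(W)] for y in range(H)]
cur = [[ref[min(y + 1, H - 1)][min(x + 2, W - 1)] for x in range(W)] for y in range(H)]
print(full_search(cur, ref, 4, 4, 4, 3))  # (2, 1, 0)
```

The two nested displacement loops and the two nested pixel loops are fully regular, which is what makes FSBM a natural fit for the kind of array processors the abstract describes.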
Pub Date : 2001-09-26 DOI: 10.1109/SIPS.2001.957346
V. Moshnyaga
A new architectural technique to reduce the energy dissipation of frame memory is proposed. Unlike existing approaches, the technique exploits the pixel correlation in video sequences, dynamically adjusting the memory bit-width to the number of bits changed per pixel. Instead of treating the data bits independently, we group the most significant bits together and activate the corresponding group of bit-lines adaptively, according to data variation. The method is neither restricted to specific bit patterns nor dependent on the storage phase. It works equally well on read and write accesses, as well as during precharging. Simulation results show that this method can reduce the total energy consumption of frame memory by 20% without affecting picture quality.
{"title":"Adaptive bit-width compression for low-energy frame memory design","authors":"V. Moshnyaga","doi":"10.1109/SIPS.2001.957346","DOIUrl":"https://doi.org/10.1109/SIPS.2001.957346","PeriodicalId":246898,"journal":{"name":"2001 IEEE Workshop on Signal Processing Systems. SiPS 2001. Design and Implementation (Cat. No.01TH8578)"},"publicationDate":"2001-09-26","platform":"Semanticscholar","paperid":"132357754"}
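The idea of activating the most-significant-bit group only when those bits change can be illustrated with a toy count of driven bit-lines. This is one possible reading of the abstract above, not the paper's circuit or energy model: the 8-bit pixel width, the 4-bit MSB group, and the rule "drive the MSB group only if its value differs from the stored pixel" are all assumptions made for illustration.

```python
def active_bitlines(prev_frame, cur_frame, width=8, msb_group=4):
    """Count bit-lines driven when cur_frame overwrites prev_frame, assuming the
    upper msb_group bit-lines are activated only when those bits change."""
    lsb_group = width - msb_group
    total = 0
    for old, new in zip(prev_frame, cur_frame):
        msb_changed = (old >> lsb_group) != (new >> lsb_group)
        total += width if msb_changed else lsb_group
    return total

prev = [0x10, 0x20, 0x35]
cur = [0x12, 0x2F, 0x45]      # only the third pixel changes its upper 4 bits
print(active_bitlines(prev, cur))   # 16, versus 24 if all 8 bit-lines were always driven
```

The count is only a proxy for energy; the 20% saving reported in the abstract comes from the authors' simulations, not from this sketch.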
Pub Date : 2001-09-26 DOI: 10.1109/SIPS.2001.957358
Quynh-Lien Nguyen-Phuc, C.M. Sorolla
This paper presents the architecture of a hardware block that supports real-time rendering of all 2D natural or synthetic visual objects defined by the MPEG-4 standard, as well as sprite decoding. It is compliant with the Main profile, Level 3, and the Hybrid visual profile. A software model allows us to validate the visual quality of the rendered scene. The complexity of the architecture is evaluated, and the architectural choices are validated by simulating a behavioral model.
{"title":"VLSI architecture of a MPEG-4 visual renderer","authors":"Quynh-Lien Nguyen-Phuc, C.M. Sorolla","doi":"10.1109/SIPS.2001.957358","DOIUrl":"https://doi.org/10.1109/SIPS.2001.957358","PeriodicalId":246898,"journal":{"name":"2001 IEEE Workshop on Signal Processing Systems. SiPS 2001. Design and Implementation (Cat. No.01TH8578)"},"publicationDate":"2001-09-26","platform":"Semanticscholar","paperid":"133359452"}
Pub Date : 2001-09-26 DOI: 10.1109/SIPS.2001.957339
Yun-Nan Chang, Yan-Sheng Li
This paper presents a design methodology for implementing the high-performance 2-D discrete wavelet transform (DWT) and 2-D inverse DWT (IDWT). By exploiting the multi-rate feature inherent in the algorithms, an effective schedule is proposed that interleaves all the row-wise and column-wise computations of different octaves onto three fundamental convolutional filters. Based on this computation schedule, highly efficient architectures can be synthesized. The resulting architectures not only achieve fast computation at lower silicon cost, owing to nearly full hardware utilization, but are also simple and modular, making them well suited to VLSI implementation. Furthermore, the proposed design methodology enables the design of a configurable architecture that can process both the DWT and the IDWT.
{"title":"Design of highly efficient VLSI architectures for 2-D DWT and 2-D IDWT","authors":"Yun-Nan Chang, Yan-Sheng Li","doi":"10.1109/SIPS.2001.957339","DOIUrl":"https://doi.org/10.1109/SIPS.2001.957339","PeriodicalId":246898,"journal":{"name":"2001 IEEE Workshop on Signal Processing Systems. SiPS 2001. Design and Implementation (Cat. No.01TH8578)"},"publicationDate":"2001-09-26","platform":"Semanticscholar","paperid":"131461675"}
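The row-wise and column-wise computations mentioned in the abstract above can be seen in a one-octave software sketch of a separable 2-D DWT and its inverse. The Haar filter pair is assumed here purely for brevity; the paper's filters, multi-octave interleaving and scheduling are not reproduced, and all function names are illustrative.

```python
def haar_1d(v):
    """One-level Haar analysis: pairwise averages (low-pass) then differences (high-pass)."""
    half = len(v) // 2
    low = [(v[2 * i] + v[2 * i + 1]) / 2.0 for i in range(half)]
    high = [(v[2 * i] - v[2 * i + 1]) / 2.0 for i in range(half)]
    return low + high

def ihaar_1d(c):
    """Inverse of haar_1d."""
    half = len(c) // 2
    out = []
    for i in range(half):
        out.append(c[i] + c[half + i])
        out.append(c[i] - c[half + i])
    return out

def transpose(m):
    return [list(row) for row in zip(*m)]

def dwt2(img):
    """One octave of a separable 2-D DWT: filter every row, then every column."""
    rows = [haar_1d(r) for r in img]
    cols = [haar_1d(c) for c in transpose(rows)]
    return transpose(cols)

def idwt2(coef):
    """Inverse of dwt2: undo the column pass, then the row pass."""
    cols = [ihaar_1d(c) for c in transpose(coef)]
    return [ihaar_1d(r) for r in transpose(cols)]

img = [[1, 2, 3, 4],
       [5, 6, 7, 8],
       [9, 10, 11, 12],
       [13, 14, 15, 16]]
assert idwt2(dwt2(img)) == img   # perfect reconstruction for this input
```

A further octave would apply `dwt2` again to the low-low quadrant only, at a quarter of the data rate; that shrinking workload is the multi-rate property a hardware schedule can exploit to keep the filters busy.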
Pub Date : 2001-09-26 DOI: 10.1109/SIPS.2001.957350
W. Wolf, I. Burak Ozer
This paper describes a smart camera system under development at Princeton University. This smart camera is designed for use in a smart room in which the camera detects the presence of a person in its visual field and determines when various gestures are made by the person. As a first step toward a VLSI implementation, we use Trimedia processors hosted by a PC. This paper describes the relationship between the algorithms used for human activity detection and the architectures required to perform these tasks in real time.
{"title":"A smart camera for real-time human activity recognition","authors":"W. Wolf, I. Burak Ozer","doi":"10.1109/SIPS.2001.957350","DOIUrl":"https://doi.org/10.1109/SIPS.2001.957350","PeriodicalId":246898,"journal":{"name":"2001 IEEE Workshop on Signal Processing Systems. SiPS 2001. Design and Implementation (Cat. No.01TH8578)"},"publicationDate":"2001-09-26","platform":"Semanticscholar","paperid":"122125596"}
Pub Date : 2001-09-26 DOI: 10.1109/SIPS.2001.957330
V. Nikolajevic, Gerhard Fettweis
The forward and inverse MDCT (modified discrete cosine transform) are two of the most computationally intensive operations in the MPEG audio coding standard. We derive sinusoidal recursive formulas for the transform kernels of the MDCT and IMDCT. We then efficiently implement general-length MDCT and IMDCT using the regressive structure derived from these formulas. The proposed regular structure is particularly suitable for parallel VLSI realization. Our solution requires less hardware than a recently proposed one.
{"title":"New recursive algorithms for the forward and inverse MDCT","authors":"V. Nikolajevic, Gerhard Fettweis","doi":"10.1109/SIPS.2001.957330","DOIUrl":"https://doi.org/10.1109/SIPS.2001.957330","PeriodicalId":246898,"journal":{"name":"2001 IEEE Workshop on Signal Processing Systems. SiPS 2001. Design and Implementation (Cat. No.01TH8578)"},"publicationDate":"2001-09-26","platform":"Semanticscholar","paperid":"121337279"}
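To give a feel for what a sinusoidal recursion for the MDCT kernel looks like, the sketch below generates the cosine kernel with the standard two-term recurrence cos((i+1)θ + φ) = 2 cos θ · cos(iθ + φ) − cos((i−1)θ + φ) and checks it against the textbook definition. This is not the authors' formula or structure, only the generic trigonometric recurrence; the MDCT definition used (2N inputs, N outputs, phase offset 1/2 + N/2) is the common textbook form and is an assumption about the convention.

```python
import math

def mdct_direct(x):
    """Textbook MDCT of 2N samples into N coefficients, one cosine call per term."""
    n2 = len(x)
    n = n2 // 2
    n0 = 0.5 + n / 2.0
    return [sum(x[i] * math.cos(math.pi / n * (i + n0) * (k + 0.5)) for i in range(n2))
            for k in range(n)]

def mdct_recursive_kernel(x):
    """Same MDCT, with each kernel value produced by the recurrence
    c[i+1] = 2*cos(theta)*c[i] - c[i-1] instead of a fresh cosine evaluation."""
    n2 = len(x)
    n = n2 // 2
    n0 = 0.5 + n / 2.0
    out = []
    for k in range(n):
        theta = math.pi / n * (k + 0.5)
        two_cos = 2.0 * math.cos(theta)
        c_prev = math.cos(theta * (n0 - 1.0))   # kernel value at i = -1
        c_cur = math.cos(theta * n0)            # kernel value at i = 0
        acc = 0.0
        for sample in x:
            acc += sample * c_cur
            c_prev, c_cur = c_cur, two_cos * c_cur - c_prev
        out.append(acc)
    return out

x = [math.sin(0.3 * i) + 0.1 * i for i in range(16)]
a = mdct_direct(x)
b = mdct_recursive_kernel(x)
assert all(abs(p - q) < 1e-9 for p, q in zip(a, b))
```

Each output coefficient needs only one multiply by a constant and one subtraction per input sample to advance the kernel, which hints at why recursive formulations map onto small, regular hardware; rounding error in the recurrence grows with transform length, so a fixed-point design would need to budget for it.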