Pub Date: 1994-12-05 DOI: 10.1109/APCCAS.1994.514606
M. Liou
Motion estimation is the most computationally intensive component of a video coding algorithm; it can consume as much as 75% of the total processing power of a video codec. Although various international video coding standards have adopted the block-matching approach for motion estimation, a designer is still free to choose the specific technique used to find the motion vectors. Consequently, numerous techniques, ranging from full search to simple fast search algorithms, have been proposed, and some of them have been implemented in VLSI chips. One of the most important requirements for an effective motion estimation algorithm is its ability to perform in real time. The choice of a specific algorithm for implementation, however, depends very much on the intended application and on the trade-off between desired performance and affordable complexity. In this paper, we first discuss the essential ingredients of effective block-matching motion estimation, then briefly describe the status of current technology, and finally present some new activities in this vital area of research.
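As an illustration of the full-search end of the spectrum mentioned in the abstract, the following is a minimal sketch of exhaustive block matching. It assumes frames stored as lists of rows of integer luminance values and uses the sum of absolute differences (SAD) as the matching criterion; the function names `sad` and `full_search`, the block size `n`, and the search range `p` are illustrative choices, not taken from the paper.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the n x n block of `cur` at
    (bx, by) and the block of `ref` displaced by (dx, dy)."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def full_search(cur, ref, bx, by, n, p):
    """Test every displacement in [-p, p] x [-p, p] and return
    (best_dx, best_dy, best_sad). Candidates outside `ref` are skipped."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-p, p + 1):
        for dx in range(-p, p + 1):
            inside = (0 <= bx + dx and bx + dx + n <= width
                      and 0 <= by + dy and by + dy + n <= height)
            if not inside:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if best is None or cost < best[2]:
                best = (dx, dy, cost)
    return best


# Toy frames: every reference pixel is unique, and the current frame is
# the reference shifted by (2, 1), padded with zeros at the border.
ref = [[16 * y + x for x in range(8)] for y in range(8)]
cur = [[ref[y + 1][x + 2] if y + 1 < 8 and x + 2 < 8 else 0
        for x in range(8)] for y in range(8)]
print(full_search(cur, ref, 2, 2, 2, 2))  # (2, 1, 0)
```

The cost is (2p + 1)^2 SAD evaluations per block, which is the kind of load that motivates the fast search algorithms and dedicated hardware the abstract refers to.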
{"title":"Algorithms and VLSI implementation for block-matching motion estimation","authors":"M. Liou","doi":"10.1109/APCCAS.1994.514606","DOIUrl":"https://doi.org/10.1109/APCCAS.1994.514606","PeriodicalId":231368,"journal":{"name":"Proceedings of APCCAS'94 - 1994 Asia Pacific Conference on Circuits and Systems","volume":"89 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1994-12-05","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"115313376","RegionNum":0,"Total":0}
Pub Date: 1994-12-05 DOI: 10.1109/APCCAS.1994.514608
Kyoung-Son Jhang, S. Ha, C. Jhon
Inter-wire spacing in VLSI chips shrinks as fabrication technology rapidly evolves. Accordingly, to obtain fast and safe VLSI circuits, it becomes important in layout design to consider the crosstalk caused by the coupling capacitance between adjacent wires. The upper bound on allowable crosstalk, called the crosstalk constraint, is usually given for each net in the design specification. This paper presents a segment rearrangement approach to channel routing that satisfies all the crosstalk constraints. Starting from a given routing, the proposed technique repeatedly rearranges the horizontal wire segments around the nets that violate the crosstalk constraints in order to reduce crosstalk. Our objective is to find a routing with the minimum number of tracks under the crosstalk constraints. In experiments, we observed that the presented technique is more effective than the track permutation technique.
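The abstract does not give the crosstalk model, so the following is only a sketch under a stated assumption: crosstalk on a net is taken to be proportional to the length over which its horizontal segments run parallel to segments of other nets on adjacent tracks. The data layout (a list of tracks, each a list of `(net, left, right)` tuples) and the function names are hypothetical. The sketch shows the check that would identify which nets violate their constraints in a given routing; it does not implement the rearrangement itself.

```python
def overlap(a, b):
    """Length of the common horizontal span of intervals a and b."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]))


def crosstalk_per_net(tracks):
    """Accumulate coupled length per net over all pairs of adjacent tracks.
    `tracks` is ordered top to bottom; each track is a list of
    (net, left, right) horizontal segments."""
    totals = {}
    for track in tracks:
        for net, _, _ in track:
            totals.setdefault(net, 0)
    for upper, lower in zip(tracks, tracks[1:]):
        for net_a, left_a, right_a in upper:
            for net_b, left_b, right_b in lower:
                if net_a == net_b:
                    continue  # a net does not disturb itself
                shared = overlap((left_a, right_a), (left_b, right_b))
                totals[net_a] += shared
                totals[net_b] += shared
    return totals


def violations(tracks, limits):
    """Nets whose accumulated coupled length exceeds their limit."""
    totals = crosstalk_per_net(tracks)
    return sorted(net for net, value in totals.items()
                  if value > limits.get(net, float("inf")))


tracks = [[("a", 0, 6)], [("b", 2, 8)], [("c", 0, 4)]]
print(crosstalk_per_net(tracks))                   # {'a': 4, 'b': 6, 'c': 2}
print(violations(tracks, {"a": 5, "b": 5, "c": 5}))  # ['b']
```

In this toy channel, net b sits between two neighbours and accumulates the most coupling, which is the kind of situation a rearrangement of segments around the violating net would target.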
{"title":"A segment rearrangement approach to channel routing under the crosstalk constraints","authors":"Kyoung-Son Jhang, S. Ha, C. Jhon","doi":"10.1109/APCCAS.1994.514608","DOIUrl":"https://doi.org/10.1109/APCCAS.1994.514608","PeriodicalId":231368,"journal":{"name":"Proceedings of APCCAS'94 - 1994 Asia Pacific Conference on Circuits and Systems","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1994-12-05","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"130961595","RegionNum":0,"Total":0}
Pub Date: 1994-12-05 DOI: 10.1109/APCCAS.1994.514553
Chengshan Xiao
In this paper, a new sensitivity measure for state-space digital filters is proposed. The new measure is a modified form of Tavsanoglu and Thiele's measure and has three advantages: (1) it is more precise than Tavsanoglu and Thiele's measure when the state-space realization of a digital filter contains 0 and ±1 coefficients; (2) it explains why the sparse Schur and Hessenberg realizations give better actual sensitivity performance than the corresponding fully parametrized optimal realizations; (3) it is equal to the global roundoff noise gain G = tr(QW) + γ obtained by Hwang under the dynamic constraint.
{"title":"An alternative sensitivity for state-space digital filters","authors":"Chengshan Xiao","doi":"10.1109/APCCAS.1994.514553","DOIUrl":"https://doi.org/10.1109/APCCAS.1994.514553","PeriodicalId":231368,"journal":{"name":"Proceedings of APCCAS'94 - 1994 Asia Pacific Conference on Circuits and Systems","volume":"42 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1994-12-05","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"123273462","RegionNum":0,"Total":0}
Pub Date: 1994-12-05 DOI: 10.1109/APCCAS.1994.514548
Wen-Ta Lee, Thou-Ho Chen, Liang-Gee Chen
In this paper, we present radix-2^k Viterbi decoding with a Transpose Path Metric (TPM) processor. The TPM processor provides a permutation function for state rearrangement using simple local interconnections. The routing complexity of the interconnection is lower than that of the previously reported delay-commutator. In addition, a Viterbi processor with a higher memory length can be constructed from lower radix-2^k modules. With its modularity and cell regularity, radix-2^k Viterbi decoding with the TPM processor is very suitable for VLSI implementation.
{"title":"The radix-2^k Viterbi decoding with transpose path metric processor","authors":"Wen-Ta Lee, Thou-Ho Chen, Liang-Gee Chen","doi":"10.1109/APCCAS.1994.514548","DOIUrl":"https://doi.org/10.1109/APCCAS.1994.514548","PeriodicalId":231368,"journal":{"name":"Proceedings of APCCAS'94 - 1994 Asia Pacific Conference on Circuits and Systems","volume":"153 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1994-12-05","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"116912995","RegionNum":0,"Total":0}
Pub Date: 1994-12-05 DOI: 10.1109/APCCAS.1994.514543
K. Ito, K. K. Parhi
Iterative digital signal processing algorithms are described by iterative data-flow graphs, in which nodes represent computations and edges represent communications. In this paper, we propose a novel method to determine the iteration bound, the fundamental lower bound on the iteration period of a processing algorithm, by using the minimum cycle mean algorithm, which achieves a lower polynomial time complexity than existing methods. Many multi-rate signal processing algorithms are conveniently represented by multi-rate data-flow graphs. The iteration bound of a multi-rate data-flow graph (MRDFG) can be determined as the iteration bound of its equivalent single-rate data-flow graph (SRDFG). We present an approach that eliminates node redundancy in the equivalent SRDFG for faster determination of the iteration bound of an MRDFG.
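The iteration bound is the maximum, over all loops of the graph, of the loop's computation time divided by its number of delays. The abstract does not spell out the authors' construction, so the sketch below only illustrates the cycle-mean idea with a generic Karp-style recurrence. It assumes a hypothetical preprocessed graph in which every edge spans exactly one delay and is weighted by the computation time along it, so that the maximum cycle mean equals the iteration bound; the function name and the example graph are illustrative.

```python
from fractions import Fraction


def max_cycle_mean(num_nodes, edges):
    """Maximum cycle mean of a directed graph via a Karp-style recurrence.
    `edges` is a list of (u, v, weight) with integer weights.
    Returns a Fraction, or None if the graph has no cycle."""
    # best[k][v]: maximum weight of any walk with exactly k edges ending at v
    best = [[None] * num_nodes for _ in range(num_nodes + 1)]
    best[0] = [0] * num_nodes
    for k in range(1, num_nodes + 1):
        for u, v, w in edges:
            if best[k - 1][u] is None:
                continue
            candidate = best[k - 1][u] + w
            if best[k][v] is None or candidate > best[k][v]:
                best[k][v] = candidate
    result = None
    for v in range(num_nodes):
        if best[num_nodes][v] is None:
            continue
        bound = min(Fraction(best[num_nodes][v] - best[k][v], num_nodes - k)
                    for k in range(num_nodes) if best[k][v] is not None)
        if result is None or bound > result:
            result = bound
    return result


# Loops: 0->1->0 (mean 3), 1->2->1 (mean 4), 2->2 (mean 1).
edges = [(0, 1, 4), (1, 0, 2), (1, 2, 3), (2, 1, 5), (2, 2, 1)]
print(max_cycle_mean(3, edges))  # 4
```

The recurrence runs in O(nodes x edges) time, which is polynomial; this contrasts with enumerating every loop explicitly, whose count can grow exponentially with graph size.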
{"title":"Determining the iteration bounds of single-rate and multi-rate data-flow graphs","authors":"K. Ito, K. K. Parhi","doi":"10.1109/APCCAS.1994.514543","DOIUrl":"https://doi.org/10.1109/APCCAS.1994.514543","PeriodicalId":231368,"journal":{"name":"Proceedings of APCCAS'94 - 1994 Asia Pacific Conference on Circuits and Systems","volume":"62 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1994-12-05","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"121915826","RegionNum":0,"Total":0}
Pub Date: 1994-12-05 DOI: 10.1109/APCCAS.1994.514604
Jyh-Shing Shyuu, Jhing-Fa Wang, Chung-Hsien Wu
In this paper, wavelet and multiresolution analysis is used to decompose the signal into a set of frequency channels of equal bandwidth on a logarithmic scale, i.e., to analyze the signal with constant-Q filters, in order to derive cepstrum features of different spatial frequency bands. Based on the decomposition, each channel is modeled as a Bayesian subnetwork, and each subnetwork is weighted by a weighting algorithm. The distortion between a reference model and the input vectors used for speech recognition is then computed by summing the weighted scores of all decomposed channels. The experimental results show that the recognition rate of this method is superior to that of non-weighting methods.
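The final scoring step, summing weighted per-channel scores, can be sketched as follows. The abstract gives neither the Bayesian subnetwork scores nor the weighting algorithm, so this sketch substitutes a plain squared Euclidean distance for each channel score and takes the weights as given inputs; all names and values are hypothetical.

```python
def channel_distortion(reference, observed):
    """Squared Euclidean distance between two feature vectors.
    A stand-in for the per-channel score, not the paper's model."""
    return sum((r - o) ** 2 for r, o in zip(reference, observed))


def weighted_distortion(ref_channels, obs_channels, weights):
    """Sum of per-channel distortions, each scaled by its channel weight."""
    if not (len(ref_channels) == len(obs_channels) == len(weights)):
        raise ValueError("one weight and one feature vector per channel")
    return sum(w * channel_distortion(r, o)
               for r, o, w in zip(ref_channels, obs_channels, weights))


ref_channels = [[1, 2], [3, 4]]   # reference features, two channels
obs_channels = [[1, 0], [0, 4]]   # input features, same layout
print(weighted_distortion(ref_channels, obs_channels, [2, 1]))  # 17
```

Setting all weights to 1 gives the non-weighting baseline that the abstract compares against.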
{"title":"A channel-weighting method for speech recognition using wavelet decompositions","authors":"Jyh-Shing Shyuu, Jhing-Fa Wang, Chung-Hsien Wu","doi":"10.1109/APCCAS.1994.514604","DOIUrl":"https://doi.org/10.1109/APCCAS.1994.514604","PeriodicalId":231368,"journal":{"name":"Proceedings of APCCAS'94 - 1994 Asia Pacific Conference on Circuits and Systems","volume":"52 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1994-12-05","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"117285458","RegionNum":0,"Total":0}
Pub Date: 1994-12-05 DOI: 10.1109/APCCAS.1994.514527
Chin-Liang Wang, Chang-Yu Chen
In this paper, we propose a linear systolic array of N basic cells (including 2N multipliers) for computing the two-dimensional (2-D) N×N-point discrete cosine transform (DCT). The array is based on the row-column decomposition but involves no matrix transposition problems. The proposed architecture is highly regular and modular, and thus very suitable for VLSI implementation. It also has an efficiency of 100 percent and a throughput of one N×N-point transform per N^2 cycles. Compared to existing array structures for the 2-D DCT, the proposed one achieves lower or equal area-time complexity with better regularity. Without any change in circuit design, it can be used directly to compute the 2-D N×N-point inverse DCT and other discrete sinusoidal transforms, such as the discrete sine transform and the discrete Hartley transform. Using the GENESIL CAD tool, we designed a prototype chip of the proposed linear array for the 8×8-point DCT in a 0.8 μm CMOS technology. The chip requires a die size of about 6.95 mm × 6.9 mm (including 108363 transistors) and can operate at a clock rate of up to 33 MHz.
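The row-column decomposition that the array builds on can be sketched in software as N one-dimensional DCTs along the rows followed by N along the columns. This sketch assumes the orthonormal DCT-II and, unlike the proposed hardware, uses an explicit transposition (`zip(*...)`) between the two passes; the function names are illustrative.

```python
import math


def dct_1d(samples):
    """Orthonormal 1-D DCT-II of a list of N samples."""
    n = len(samples)
    out = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        acc = sum(samples[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                  for i in range(n))
        out.append(scale * acc)
    return out


def dct_2d(block):
    """2-D DCT by row-column decomposition: rows first, then columns."""
    row_pass = [dct_1d(row) for row in block]
    col_pass = [dct_1d(list(col)) for col in zip(*row_pass)]
    # col_pass is indexed [horizontal freq][vertical freq]; transpose back.
    return [list(r) for r in zip(*col_pass)]


flat = [[1.0] * 8 for _ in range(8)]
coeffs = dct_2d(flat)
print(round(coeffs[0][0], 6))  # 8.0, all energy in the DC coefficient
```

Each 1-D pass costs N^2 multiplications per row or column when computed directly, so the separable form needs 2N^3 multiplications per block rather than the N^4 of a direct 2-D evaluation.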
{"title":"A linear systolic array for the 2-D discrete cosine transform","authors":"Chin-Liang Wang, Chang-Yu Chen","doi":"10.1109/APCCAS.1994.514527","DOIUrl":"https://doi.org/10.1109/APCCAS.1994.514527","PeriodicalId":231368,"journal":{"name":"Proceedings of APCCAS'94 - 1994 Asia Pacific Conference on Circuits and Systems","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1994-12-05","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"129609297","RegionNum":0,"Total":0}
Pub Date: 1994-12-05 DOI: 10.1109/APCCAS.1994.514541
J. Vilasdechanon, K. Likit-Anurucks, N. Sugino, A. Nishihara
A computational ordering and code generation method for DSP compilers is proposed that takes the target DSP architecture, i.e., the number of accumulators, the bus structure, and the multi-operation instruction codes, into consideration. By combining computational ordering and code generation into one step, better output code may be obtained from the DSP compiler. Although only the μPD7720 is used as the target DSP in this work, the method may be applied to other DSPs.
{"title":"Architecture driven computational ordering and code generation method for DSP compiler","authors":"J. Vilasdechanon, K. Likit-Anurucks, N. Sugino, A. Nishihara","doi":"10.1109/APCCAS.1994.514541","DOIUrl":"https://doi.org/10.1109/APCCAS.1994.514541","PeriodicalId":231368,"journal":{"name":"Proceedings of APCCAS'94 - 1994 Asia Pacific Conference on Circuits and Systems","volume":"27 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1994-12-05","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"127602329","RegionNum":0,"Total":0}
Pub Date: 1994-12-05 DOI: 10.1109/APCCAS.1994.514594
Tsutomu Sasao, D. Dednath
A generalized Reed-Muller expression (GRM) is obtained by negating some of the literals in a positive polarity Reed-Muller expression (PPRM). There are at most 2^(n·2^(n-1)) different GRMs for an n-variable function. A minimum GRM is one with the fewest products. This paper presents certain properties of GRMs and an exact minimization algorithm for them. The minimization algorithm uses binary decision diagrams. For functions of up to five variables, all the representative functions of the NP-equivalence classes were generated and minimized. A table compares the number of products necessary to represent 5-variable functions for 7 classes of expressions: FPRMs, KROs, PSDRMs, PSD-KROs, GRMs, ESOPs, and SOPs. GRMs require, on average, fewer products than sum-of-products expressions and have easily testable realizations.
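To make the starting point concrete, the sketch below computes the PPRM of a function from its truth table with the standard XOR butterfly (Reed-Muller transform), and evaluates the upper bound on the number of GRMs. The bound follows from counting literals: the 2^n possible products of a PPRM contain n·2^(n-1) literals in total, and each may be negated or not. This is not the paper's BDD-based minimization algorithm; the function names and the truth-table bit ordering are assumptions made for illustration.

```python
def pprm_coefficients(truth_table):
    """PPRM coefficients of a function given as a truth table of length 2**n.
    Index bits select variables: coeffs[m] == 1 means the product of the
    variables whose bits are set in m appears in the XOR sum."""
    coeffs = list(truth_table)
    size = len(coeffs)
    if size == 0 or size & (size - 1):
        raise ValueError("truth table length must be a power of two")
    step = 1
    while step < size:
        for i in range(size):
            if i & step:
                coeffs[i] ^= coeffs[i ^ step]  # XOR butterfly on one variable
        step <<= 1
    return coeffs


def grm_upper_bound(n):
    """Each of the n * 2**(n-1) literals in the products may be negated."""
    return 2 ** (n * 2 ** (n - 1))


# OR of two variables: x0 XOR x1 XOR x0*x1
print(pprm_coefficients([0, 1, 1, 1]))  # [0, 1, 1, 1]
print(grm_upper_bound(2))               # 16
```

The OR function needs three products in its PPRM, whereas negating literals can reduce the count (for example, 1 XOR x0'x1'), which is the freedom a GRM minimizer exploits.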
{"title":"An exact minimization algorithm for generalized Reed-Muller expressions","authors":"Tsutomu Sasao, D. Dednath","doi":"10.1109/APCCAS.1994.514594","DOIUrl":"https://doi.org/10.1109/APCCAS.1994.514594","PeriodicalId":231368,"journal":{"name":"Proceedings of APCCAS'94 - 1994 Asia Pacific Conference on Circuits and Systems","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1994-12-05","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"121004406","RegionNum":0,"Total":0}
Pub Date: 1994-12-05 DOI: 10.1109/APCCAS.1994.514623
Imgeun Lee, Jongsik Kim, Yougkyu Kim, Seongman Kim, Gooman Park, Kyu Tae Park
In this paper, a model of the human visual system (HVS) is applied to wavelet transform image coding. The wavelet transform decomposes the spatial frequency domain into octave bands, a process similar to that of the HVS. The wavelet transform and the modulation transfer function (MTF) of the HVS can therefore be optimally coupled to gain coding efficiency. The quantizing step size for each subband is determined by an integral ratio derived from the frequency response of the wavelet filter and the MTF. The results show that the method outperforms JPEG.
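The idea of an octave-band decomposition with a separate quantizing step size per subband can be sketched in one dimension as follows. The abstract specifies neither the wavelet filter nor the MTF-derived step sizes, so this sketch substitutes the Haar filter (in its averaging form) and uses placeholder step values; all names and numbers are hypothetical.

```python
def haar_step(signal):
    """One Haar analysis step: pairwise averages and half-differences."""
    approx = [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    return approx, detail


def octave_decompose(signal, levels):
    """Return [detail_1, ..., detail_levels, approx_levels], finest band first.
    The signal length must be divisible by 2**levels."""
    bands = []
    approx = list(signal)
    for _ in range(levels):
        approx, detail = haar_step(approx)
        bands.append(detail)
    bands.append(approx)
    return bands


def quantize_bands(bands, steps):
    """Uniform quantization with one step size per subband."""
    return [[round(c / q) for c in band] for band, q in zip(bands, steps)]


bands = octave_decompose([8, 4, 6, 2, 10, 10, 0, 4], levels=2)
print(bands)  # [[2.0, 2.0, 0.0, -2.0], [1.0, 4.0], [5.0, 6.0]]
# Placeholder steps: coarsest quantization for the finest band.
print(quantize_bands(bands, [3, 1.5, 1]))  # [[1, 1, 0, -1], [1, 3], [5, 6]]
```

In an HVS-driven coder, the step values would come from weighting each subband by the visual sensitivity in its frequency range instead of being chosen by hand as they are here.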
{"title":"Wavelet transform image coding using human visual system","authors":"Imgeun Lee, Jongsik Kim, Yougkyu Kim, Seongman Kim, Gooman Park, Kyu Tae Park","doi":"10.1109/APCCAS.1994.514623","DOIUrl":"https://doi.org/10.1109/APCCAS.1994.514623","PeriodicalId":231368,"journal":{"name":"Proceedings of APCCAS'94 - 1994 Asia Pacific Conference on Circuits and Systems","volume":"53 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1994-12-05","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"126321929","RegionNum":0,"Total":0}