Accurate cell nucleus segmentation is necessary for automated cytological image analysis. Thresholding is a crucial step in segmentation, and the accuracy of segmentation depends on the accuracy of thresholding. In this paper we propose a new method for thresholding photomicrographs of diversely stained cytology smears. To account for the different stains, we use different color spaces. A new local thresholding scheme is developed to address the problem of nonuniform staining. Finally, the results obtained with the new method are compared with those of several existing thresholding methods, showing the improvement achieved.
Title: Adaptive local thresholding for detection of nuclei in diversity stained cytology images. Authors: Neerad Phansalkar, Sumit More, Ashish Sabale, Madhuri Joshi. Venue: 2011 International Conference on Communications and Signal Processing. Pub Date: 2011-03-24; DOI: 10.1109/ICCSP.2011.5739305
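The abstract above does not give the thresholding formula, so the following is only a minimal sketch of what a local (windowed) threshold can look like. It uses a generic Sauvola-style rule, T = m * (1 + k * (s / r - 1)), where m and s are the mean and standard deviation in a window around each pixel. The function name `local_threshold` and the parameters `radius`, `k`, and `r` are assumptions chosen for illustration and are not taken from the paper.

```python
from statistics import mean, pstdev


def local_threshold(image, radius=1, k=0.25, r=0.5):
    """Return a mask that is True where a pixel is darker than its local threshold.

    image: list of rows with intensities scaled to [0, 1].
    The per-pixel threshold is the Sauvola-style T = m * (1 + k * (s / r - 1)),
    computed over a (2 * radius + 1)-wide window clipped at the image border.
    All parameter values are illustrative assumptions.
    """
    h, w = len(image), len(image[0])
    mask = [[False] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [image[j][i]
                      for j in range(max(0, y - radius), min(h, y + radius + 1))
                      for i in range(max(0, x - radius), min(w, x + radius + 1))]
            m, s = mean(window), pstdev(window)
            t = m * (1 + k * (s / r - 1))
            mask[y][x] = image[y][x] < t  # nuclei are assumed darker than background
    return mask


# A bright 5x5 background with one dark pixel standing in for a nucleus.
image = [[0.8] * 5 for _ in range(5)]
image[2][2] = 0.2
mask = local_threshold(image)
```

Because the threshold follows the local mean, a slowly varying background (as with nonuniform staining) shifts the threshold along with it, which a single global threshold cannot do.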
Pub Date: 2011-03-24; DOI: 10.1109/ICCSP.2011.5739352
Paramjeet Singh, A. K. Verma
A combined quasi-static Spectral Domain Approach (SDA) and Single Layer Reduction (SLR) technique is presented to compute the dielectric loss of a multilayer Coplanar Waveguide (CPW). The Green's function for the multilayer structure is derived using the Transverse Transmission Line (TTL) method. The quasi-static SDA is used to compute the effective relative permittivity of the multilayer CPW. The SLR technique converts the multilayer CPW structure to an equivalent single-layer CPW structure, for which the dielectric loss is then computed.
Title: Dielectric loss computation of multilayer Coplanar Waveguide
Pub Date: 2011-03-24; DOI: 10.1109/ICCSP.2011.5739310
Sajan Goud, M. Jacob
We recently proposed an accelerated dynamic magnetic resonance imaging (MRI) reconstruction algorithm that exploits the underlying low-rank and sparse properties of the data to achieve highly accelerated reconstructions. In this paper, we validate our algorithm in the context of dynamic free-breathing cardiac perfusion MRI on the Physiologically Improved Non Uniform Cardiac Torso (PINCAT) phantom. We study the practical utility of our scheme in providing significantly better reconstructions at higher accelerations than existing methods. We demonstrate that, unlike existing low-rank-based schemes, our scheme does not trade off accurate temporal modeling against spatial quality. Our results also show that our scheme achieves better reconstruction quality at high accelerations than using either the low-rank or the sparsity property alone. We argue that the speedup obtained by our scheme could be exploited in perfusion imaging to provide better spatio-temporal resolution and volume coverage while the subject is breathing freely.
Title: Free breathing cardiac perfusion MRI reconstruction using a sparse and low rank model: Validation with the Physiologically Improved NCAT phantom
Pub Date: 2011-03-24; DOI: 10.1109/ICCSP.2011.5739315
Vivek Dhoot, Sanjeev Gupta
In this paper, a novel multifractal Cantor-based multiband monopole antenna is proposed and analyzed using the 3-Dimensional Finite Difference Time Domain (3D-FDTD) method. The proposed antenna has multiband characteristics covering several wireless applications in the Ultra Wideband (UWB) range, including WLAN at 2.4 GHz and 5.8 GHz, GSM, PCS and DCS. A program based on the 3D-FDTD method is written and used to observe the return loss of the proposed antenna.
Title: Full wave analysis of a novel multifractal multiband antenna using 3D-FDTD approach
Pub Date: 2011-03-24; DOI: 10.1109/ICCSP.2011.5739371
Sunil P. Joshi, R. Paily
The use of error-correcting codes has proven to be an effective way to overcome data corruption in digital wireless communication channels, enabling reliable transmission over noisy and fading channels. Because decoders consume considerable power, low-power decoder designs are required. Power reduction in any system can be achieved at the device, circuit, or architectural level; in this paper it is achieved at the architectural level. A Viterbi Decoder (VD) with an architecturally modified Add-Compare-Select Unit (ACSU) and a clock-gated Survivor Memory Unit (SMU) is designed for low-power wireless applications. A decoder with code rate k/n = 1/2 and constraint length K = 7 has been implemented in 130 nm technology. It is synthesized using Synopsys Design Compiler, and its power is estimated with Power Compiler. A throughput of 125 Mbps is achieved, satisfying the requirement for wireless applications. The bit error rate of the proposed system is the same as that of the modified register exchange VD. Clock gating reduces power by around 66%.
Title: Low power Viterbi Decoder by modified ACSU architecture and clock gating method
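To make the add-compare-select and survivor-memory roles concrete, here is a minimal software sketch of a hard-decision Viterbi decoder for a rate-1/2, K = 7 code. It is not the hardware architecture described in the abstract: it uses plain traceback rather than register exchange, has no clock gating, and the generator polynomials (octal 171 and 133) are an assumption, since the abstract does not state them.

```python
K = 7                          # constraint length, as in the abstract
G = (0o171, 0o133)             # assumed generator polynomials (not given in the abstract)
N_STATES = 1 << (K - 1)


def parity(x):
    """Return 1 if x has an odd number of set bits, else 0."""
    return bin(x).count("1") & 1


def encode(bits):
    """Rate-1/2 convolutional encoder; appends K-1 zero tail bits to end in state 0."""
    state, out = 0, []
    for b in list(bits) + [0] * (K - 1):
        reg = (b << (K - 1)) | state          # newest bit sits in the MSB
        out.extend(parity(reg & g) for g in G)
        state = reg >> 1
    return out


def viterbi_decode(received):
    """Hard-decision Viterbi decoding: add-compare-select per step, then traceback."""
    inf = float("inf")
    metrics = [0] + [inf] * (N_STATES - 1)    # the encoder starts in state 0
    history = []                              # survivor memory: (previous state, bit) per state
    for t in range(0, len(received), 2):
        pair = received[t:t + 2]
        new_metrics = [inf] * N_STATES
        survivors = [None] * N_STATES
        for state in range(N_STATES):
            if metrics[state] == inf:
                continue
            for b in (0, 1):
                reg = (b << (K - 1)) | state
                branch = sum(parity(reg & g) != r for g, r in zip(G, pair))
                nxt = reg >> 1
                candidate = metrics[state] + branch       # add
                if candidate < new_metrics[nxt]:          # compare
                    new_metrics[nxt] = candidate          # select
                    survivors[nxt] = (state, b)
        metrics = new_metrics
        history.append(survivors)
    state, decoded = 0, []                    # tail bits force the final state to 0
    for survivors in reversed(history):
        state, b = survivors[state]
        decoded.append(b)
    decoded.reverse()
    return decoded[:-(K - 1)]                 # drop the tail bits


message = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0]
coded = encode(message)
corrupted = list(coded)
corrupted[3] ^= 1                             # inject two channel bit errors
corrupted[20] ^= 1
recovered = viterbi_decode(corrupted)
```

The inner loop is the software counterpart of the ACSU, and `history` plays the role of the survivor memory; in hardware these are the blocks the abstract targets for power reduction.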
Pub Date: 2011-03-24; DOI: 10.1109/ICCSP.2011.5739348
D. Murali Mohan, Dileep B. Karpur, M. Narayan, J. Kishore
The spectrum of speech signals has frequency components from 50 Hz to 7 kHz (wideband speech). However, for historical reasons, speech is band-pass filtered to 300 Hz-3.4 kHz in PSTN networks; this is referred to as narrowband speech. The bandwidth missing from narrowband speech contributes to speech quality and intelligibility. This paper addresses the problem of artificial bandwidth extension of narrowband speech to wideband speech. The proposed method is based on statistical recovery of the spectral envelope parameters using a Gaussian Mixture Model (GMM), with a spectral shifting method used for excitation extension.
Title: Artificial bandwidth extension of narrowband speech using Gaussian Mixture Model
Pub Date: 2011-03-24; DOI: 10.1109/ICCSP.2011.5739299
S. Shelke, S. Apte
This paper presents a novel approach for recognition of unconstrained handwritten Marathi compound characters. Recognition is carried out using a multistage feature extraction and classification scheme. The initial stages of feature extraction are based on structural features, and the characters are classified according to their parameters. The final stage of feature extraction generates kernels using the Wavelet transform. A single-level Wavelet decomposition is used to generate the approximation coefficients, which are stored as kernels for matching. A modified wavelet-based kernel generation method is also implemented. In both cases, recognition is done by template matching. The results are analyzed for both kernel generation techniques and for varying resize factors. The recognition rate achieved by the proposed method is 95.89% and 96.00% for 16×16 and 32×32 resize factors respectively with wavelet-based kernels, and 96.41% and 97.94% for 16×16 and 32×32 resize factors respectively with modified wavelet-based kernels.
Title: A novel multistage classification and Wavelet based kernel generation for handwritten Marathi compound character recognition
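The final stage described above (single-level decomposition, approximation coefficients stored as kernels, template matching) can be sketched as follows. The abstract does not name the wavelet or the matching metric, so this sketch assumes a Haar wavelet and squared Euclidean distance; the names `haar_approx` and `match`, and the tiny 4×4 "characters", are hypothetical and only for illustration.

```python
def haar_approx(img):
    """Single-level 2D Haar approximation coefficients of an even-sized image.

    Each coefficient is the sum of a 2x2 block divided by 2 (orthonormal Haar).
    """
    return [[(img[y][x] + img[y][x + 1] + img[y + 1][x] + img[y + 1][x + 1]) / 2.0
             for x in range(0, len(img[0]), 2)]
            for y in range(0, len(img), 2)]


def match(img, kernels):
    """Return the label of the stored kernel closest to the image's coefficients."""
    coeffs = haar_approx(img)

    def dist(kernel):
        return sum((a - b) ** 2
                   for row_a, row_b in zip(coeffs, kernel)
                   for a, b in zip(row_a, row_b))

    return min(kernels, key=lambda label: dist(kernels[label]))


# Two toy templates: a bar on the left and a bar on the top.
left = [[1, 1, 0, 0]] * 4
top = [[1, 1, 1, 1], [1, 1, 1, 1], [0, 0, 0, 0], [0, 0, 0, 0]]
kernels = {"left": haar_approx(left), "top": haar_approx(top)}

# A noisy version of the left bar (one pixel missing, one spurious pixel).
noisy = [[1, 1, 0, 0], [1, 1, 0, 0], [1, 0, 0, 0], [1, 1, 0, 1]]
label = match(noisy, kernels)
```

Storing the approximation coefficients instead of the raw image quarters the template size, and the 2×2 averaging makes matching less sensitive to isolated pixel noise.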
Pub Date: 2011-03-24; DOI: 10.1109/ICCSP.2011.5739363
M. Sangeetha, V. Bhaskar
Wideband code division multiple access (W-CDMA) is a third generation (3G) mobile wireless technology that promises much higher data speeds for mobile and portable wireless devices than those commonly offered in today's market. In W-CDMA systems, when information is transmitted over multipath channels, intersymbol interference (ISI), which results from interchip interference (ICI), and multiple access interference (MAI) cannot be easily eliminated. Although it is possible to design multiuser detectors that suppress MAI and ISI, these detectors often require explicit knowledge of at least the desired users' signature waveform. Torlak and Xu proposed a blind estimation algorithm for Asynchronous CDMA (A-CDMA) systems to estimate the multiple users' symbols [8]. In our work, we study a similar blind channel estimation scheme for downlink W-CDMA systems that provides estimates of the subchannels of multiple users by exploiting the structural information of the data output. In particular, we show that the subspace of the (data+noise) matrix contains sufficient information for unique determination of the channels, and hence of the signature waveforms and signal constellation. Performance measures such as bit error rate and root mean square error are plotted for various channel fading conditions.
Title: Downlink blind channel estimation for W-CDMA systems and performance analysis under various fading channel conditions
Pub Date: 2011-03-24; DOI: 10.1109/ICCSP.2011.5739361
P. Shanmugapriya, Y. Venkataramani
A Fuzzy Wavelet Network (FWN) is proposed in this paper to model the characteristics of a speaker in an automatic speaker verification system. A neural network that uses wavelets as activation functions is called a wavelet network (Wavenet). A Wavenet can extract the distinguishing and essential features of frequency-rich signals, which is required in classification and identification problems such as speaker verification. The nonlinearity and structured, human-like knowledge representation of a fuzzy inference system make it a suitable model for speaker verification when combined with the wavelet network. In this approach, wavelet theory is combined with fuzzy neural network theory, leading to the construction of the FWN. The advantage of the FWN is that the membership functions can be easily merged or divided using multiresolution properties, and the rules can be evaluated during learning. The performance of the proposed speaker verification system is evaluated on the TIMIT database. The proposed system is compared with a system using the state-of-the-art model (GMM). Compared with GMM and WNN, FWN provides better verification performance.
Title: Implementation of speaker verification system using Fuzzy Wavelet Network
Pub Date: 2011-03-24; DOI: 10.1109/ICCSP.2011.5739300
A. Revathi, Y. Venkataramani
The main objective of this paper is to explore the effectiveness of perceptual features for isolated digit and continuous speech recognition. The proposed perceptual features are captured and codebook indices are extracted. The expectation maximization algorithm is used to generate HMM models for the utterances. The speech recognition system is evaluated on clean test utterances, and the experimental results show the performance of the proposed algorithm in recognizing isolated digits and continuous speech based on the maximum log likelihood between the test features and the HMM model for each utterance. The features are tested on utterances randomly chosen from the “TI Digits_1”, “TI Digits_2” and “TIMIT” databases. The algorithm is tested with VQ and with a combination of VQ and HMM speech modeling techniques. The perceptual linear predictive cepstrum yields accuracies of 86% and 93% for speaker-independent isolated digit recognition using VQ and the combination of VQ and HMM speech models respectively. This feature also gives 99% and 100% accuracy for speaker-independent continuous speech recognition using VQ and the combination of VQ and HMM speech modeling techniques respectively.
Title: Speaker independent continuous speech and isolated digit recognition using VQ and HMM
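As an illustration of the VQ part only, the sketch below extracts codebook indices for a sequence of feature vectors and picks the word whose codebook gives the lowest average distortion. The HMM stage and the perceptual feature extraction from the abstract are not covered, and minimum-distortion scoring is an assumption made here for illustration; the function names, the two-dimensional toy features, and the codebooks are all hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def codebook_indices(features, codebook):
    """Index of the nearest codeword for each feature vector."""
    return [min(range(len(codebook)), key=lambda i: sq_dist(f, codebook[i]))
            for f in features]


def average_distortion(features, codebook):
    """Mean squared distance from each feature vector to its nearest codeword."""
    return sum(min(sq_dist(f, c) for c in codebook) for f in features) / len(features)


def recognize(features, codebooks):
    """Return the label whose codebook yields the lowest average distortion."""
    return min(codebooks, key=lambda label: average_distortion(features, codebooks[label]))


# Toy codebooks for two words and a short test utterance.
codebooks = {
    "one": [(0.0, 0.0), (1.0, 0.0)],
    "two": [(5.0, 5.0), (6.0, 5.0)],
}
test_features = [(0.1, 0.0), (0.9, 0.1)]
indices = codebook_indices(test_features, codebooks["one"])
word = recognize(test_features, codebooks)
```

In a combined VQ and HMM system, the index sequence returned by `codebook_indices` would typically serve as the discrete observation sequence scored by each word's HMM.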