2011 International Conference on Communications and Signal Processing: Latest Articles

Adaptive local thresholding for detection of nuclei in diversity stained cytology images

2011 International Conference on Communications and Signal Processing

Pub Date : 2011-03-24 DOI: 10.1109/ICCSP.2011.5739305

Neerad Phansalkar, Sumit More, Ashish Sabale, Madhuri Joshi

Accurate cell nucleus segmentation is necessary for automated cytological image analysis. Thresholding is a crucial step in segmentation, and the accuracy of segmentation depends on the accuracy of thresholding. In this paper we propose a new method for thresholding photomicrographs of diversely stained cytology smears. To account for the different stains, we use different color spaces. A new local thresholding scheme is developed to solve the problem of nonuniform staining. Finally, the results obtained from the new method are compared with those of some existing thresholding methods, clearly showing the improvement achieved.
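The abstract does not give the thresholding formula, so the following is only a generic illustration of why a local threshold copes with nonuniform staining where a single global value does not. It is a Niblack-style rule (threshold = local mean + k × local standard deviation), not the authors' scheme; the function name `local_threshold`, the window size, and the value of `k` are assumptions made for this sketch.

```python
import numpy as np

def local_threshold(gray, window=15, k=-0.2):
    """Niblack-style local threshold (illustrative, not the paper's method).

    A pixel is marked as foreground when it is darker than
    local_mean + k * local_std, computed over a window x window neighbourhood.
    Nuclei are assumed to be darker than the background, hence the negative k.
    """
    if window % 2 == 0:
        raise ValueError("window must be odd")
    gray = np.asarray(gray, dtype=float)
    pad = window // 2
    # Reflect-pad so every pixel has a full neighbourhood.
    padded = np.pad(gray, pad, mode="reflect")
    # One (window x window) view per pixel; requires NumPy >= 1.20.
    views = np.lib.stride_tricks.sliding_window_view(padded, (window, window))
    local_mean = views.mean(axis=(-2, -1))
    local_std = views.std(axis=(-2, -1))
    return gray < local_mean + k * local_std
```

Because the mean and standard deviation are recomputed around every pixel, a slow intensity gradient across the smear shifts the threshold along with it, which a single global cut-off cannot do.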

Citations: 256

Dielectric loss computation of multilayer Coplanar Waveguide

2011 International Conference on Communications and Signal Processing

Pub Date : 2011-03-24 DOI: 10.1109/ICCSP.2011.5739352

Paramjeet Singh, A. K. Verma

A combined quasi-static Spectral Domain Approach (SDA) and Single Layer Reduction (SLR) technique is presented to compute the dielectric loss of a multilayer Coplanar Waveguide (CPW). The Green's function for the multilayer structure is derived using the Transverse Transmission Line (TTL) method. The quasi-static SDA is used to compute the effective relative permittivity of the multilayer CPW. The SLR technique converts the multilayer CPW structure to an equivalent single-layer CPW structure, and the dielectric loss is computed for this equivalent structure.
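The abstract does not state the loss expression applied to the equivalent single-layer structure. For orientation only, the textbook closed form for the dielectric attenuation of a quasi-TEM line partly filled with one lossy dielectric is shown below; whether the paper uses exactly this form is an assumption. Here $\varepsilon_r$ and $\tan\delta$ describe the equivalent substrate, $\varepsilon_{\mathrm{eff}}$ is the effective relative permittivity, and $k_0$ is the free-space wavenumber.

```latex
\alpha_d \;=\; \frac{k_0\,\varepsilon_r\,(\varepsilon_{\mathrm{eff}} - 1)\,\tan\delta}
                    {2\,\sqrt{\varepsilon_{\mathrm{eff}}}\,(\varepsilon_r - 1)}
\quad \text{Np/m},
\qquad k_0 = \frac{2\pi f}{c}
```

The factor $(\varepsilon_{\mathrm{eff}} - 1)/(\varepsilon_r - 1)$ acts as a filling factor: it tends to 1 when the field is entirely inside the dielectric, where the expression reduces to the homogeneous TEM result $k_0\sqrt{\varepsilon_r}\,\tan\delta/2$.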

Citations: 3

Free breathing cardiac perfusion MRI reconstruction using a sparse and low rank model: Validation with the Physiologically Improved NCAT phantom

2011 International Conference on Communications and Signal Processing

Pub Date : 2011-03-24 DOI: 10.1109/ICCSP.2011.5739310

Sajan Goud, M. Jacob

We recently proposed an accelerated dynamic magnetic resonance imaging (MRI) reconstruction algorithm that exploits the underlying low rank and sparse properties of the data to achieve highly accelerated reconstructions. In this paper, we validate our algorithm in the context of dynamic free breathing cardiac perfusion MRI on the Physiologically Improved Non Uniform Cardiac Torso (PINCAT) phantom. We study the practical utility of our scheme in providing significantly better reconstructions at higher accelerations than existing methods. We demonstrate that, unlike existing low rank based schemes, our scheme does not trade off accurate temporal modeling against spatial quality. Our results also show that our scheme achieves better reconstruction quality at high accelerations than using the low rank or sparsity properties individually. We argue that the speed up obtained by our scheme could be capitalized on in perfusion imaging to provide better spatio-temporal resolution and volume coverage while the subject is freely breathing.
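The two priors named in the abstract are commonly enforced through two shrinkage operators: singular value thresholding for the low rank prior and element-wise soft thresholding for the sparsity prior. The sketch below shows only these generic building blocks; the authors' actual penalties, sparsifying transform, and optimization scheme are not given in the abstract, and the function names and the Casorati (pixels × time frames) layout are assumptions.

```python
import numpy as np

def singular_value_threshold(casorati, tau):
    """Shrink the singular values of a (pixels x frames) matrix by tau.

    This is the proximal operator of the nuclear norm and promotes low rank.
    """
    u, s, vt = np.linalg.svd(casorati, full_matrices=False)
    s_shrunk = np.maximum(s - tau, 0.0)
    # Scale the columns of u by the shrunken singular values, then recombine.
    return (u * s_shrunk) @ vt

def soft_threshold(coeffs, lam):
    """Shrink real-valued coefficients toward zero by lam (promotes sparsity)."""
    return np.sign(coeffs) * np.maximum(np.abs(coeffs) - lam, 0.0)
```

In an iterative reconstruction these operators would typically alternate with a data-consistency step against the undersampled k-space measurements, which is omitted here.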

Citations: 1

Full wave analysis of a novel multifractal multiband antenna using 3D-FDTD approach

2011 International Conference on Communications and Signal Processing

Pub Date : 2011-03-24 DOI: 10.1109/ICCSP.2011.5739315

Vivek Dhoot, Sanjeev Gupta

In this paper, a novel multifractal Cantor based multiband monopole antenna is proposed and analyzed using the 3-Dimensional Finite Difference Time Domain (3D-FDTD) method. The proposed antenna has multiband characteristics covering several wireless applications in Ultra Wideband (UWB) including WLAN 2.4 GHz and 5.8 GHz, GSM, PCS and DCS applications. A program based on the 3D-FDTD method is written and used to observe the return loss of the proposed antenna.
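The core of any FDTD program is the leapfrog update of interleaved electric and magnetic field samples. The sketch below reduces this to one dimension in free space with normalized units, purely to show the update pattern; the paper's program is three-dimensional and models the antenna geometry, neither of which is reproduced here. The function name, grid size, Courant number, and Gaussian source are all assumptions.

```python
import numpy as np

def fdtd_1d(n_cells=200, n_steps=150, source_cell=100):
    """1D free-space Yee leapfrog update in normalized units (illustrative)."""
    ez = np.zeros(n_cells)        # electric field at integer grid points
    hy = np.zeros(n_cells - 1)    # magnetic field at half-integer grid points
    courant = 0.5                 # c * dt / dx, must be <= 1 for stability in 1D
    for step in range(n_steps):
        # Update H from the spatial difference of E.
        hy += courant * (ez[1:] - ez[:-1])
        # Update interior E from the spatial difference of H.
        # The end cells are never updated and so act as perfect conductors.
        ez[1:-1] += courant * (hy[1:] - hy[:-1])
        # Additive (soft) Gaussian pulse source.
        ez[source_cell] += np.exp(-((step - 30) / 10.0) ** 2)
    return ez
```

In a full 3D simulation, return loss is usually obtained by recording incident and reflected time-domain signals at the feed and taking the ratio of their Fourier transforms; that post-processing step is not shown.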

Citations: 0

Low power Viterbi Decoder by modified ACSU architecture and clock gating method

2011 International Conference on Communications and Signal Processing

Pub Date : 2011-03-24 DOI: 10.1109/ICCSP.2011.5739371

Sunil P. Joshi, R. Paily

The use of error-correcting codes has proven to be an effective way to overcome data corruption in digital wireless communication channels, enabling reliable transmission over noisy and fading channels. Because decoders consume a lot of power, low power decoders are required. Power reduction in any system can be achieved at the device level, the circuit level or the architectural level. In this paper, power reduction is achieved at the architectural level. A Viterbi Decoder (VD) with an architectural modification of the Add-Compare-Select Unit (ACSU) and a clock gated Survivor Memory Unit (SMU) is designed for low power wireless applications. A decoder system with code rate k/n=1/2 and constraint length K=7 has been implemented in 130 nm technology. It is synthesized using Synopsys Design Compiler and its power is estimated with Power Compiler. A throughput of 125 Mbps is achieved, satisfying the requirement for wireless applications. The bit error rate of the proposed system is the same as that of the modified register exchange VD. Clock gating reduces power by around 66%.
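To make the add-compare-select step concrete, here is a small software model of hard-decision Viterbi decoding for a rate 1/2, K=7 code. It says nothing about the paper's modified ACSU hardware or its clock gating. The generator polynomials (171, 133 octal) are a widely used pair chosen here as an assumption, since the abstract does not name them, and survivor handling is a plain traceback rather than the register exchange mentioned in the abstract.

```python
K = 7                      # constraint length from the abstract
G = (0o171, 0o133)         # assumed generator polynomials (not stated in the abstract)
N_STATES = 1 << (K - 1)

def parity(x):
    return bin(x).count("1") & 1

def encode(bits):
    """Rate 1/2 convolutional encoder; the state holds the previous K-1 input bits."""
    state, out = 0, []
    for b in bits:
        reg = (b << (K - 1)) | state
        out.extend(parity(reg & g) for g in G)
        state = reg >> 1
    return out

def viterbi_decode(received, n_bits):
    """Hard-decision Viterbi decoder with full traceback."""
    inf = float("inf")
    metrics = [0] + [inf] * (N_STATES - 1)   # encoder starts in state 0
    history = []
    for t in range(n_bits):
        r = received[2 * t: 2 * t + 2]
        new_metrics = [inf] * N_STATES
        prev = [0] * N_STATES
        for state in range(N_STATES):
            if metrics[state] == inf:
                continue
            for b in (0, 1):
                reg = (b << (K - 1)) | state
                nxt = reg >> 1
                # Branch metric: Hamming distance to the received symbol pair.
                branch = sum(parity(reg & g) != r[i] for i, g in enumerate(G))
                cand = metrics[state] + branch      # add
                if cand < new_metrics[nxt]:         # compare
                    new_metrics[nxt] = cand         # select
                    prev[nxt] = state
        history.append(prev)
        metrics = new_metrics
    # Trace back from the best final state; the newest input bit is the state's MSB.
    state = min(range(N_STATES), key=metrics.__getitem__)
    decoded = []
    for prev in reversed(history):
        decoded.append(state >> (K - 2))
        state = prev[state]
    return decoded[::-1]
```

The inner loop is what an ACSU performs in hardware for every state in every symbol period, which is why that unit and the survivor memory dominate the decoder's power budget.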

Citations: 9

Artificial bandwidth extension of narrowband speech using Gaussian Mixture Model

2011 International Conference on Communications and Signal Processing

Pub Date : 2011-03-24 DOI: 10.1109/ICCSP.2011.5739348

D. Murali Mohan, Dileep B. Karpur, M. Narayan, J. Kishore

The spectrum of a speech signal has frequency components from 50 Hz to 7 kHz (wideband speech). However, for historical reasons speech is band-pass filtered to 300 Hz-3.4 kHz in PSTN networks, and this speech is referred to as narrowband speech. The bandwidth missing from narrowband speech contributes to speech quality and intelligibility. This paper addresses the problem of artificial bandwidth extension of narrowband speech to wideband speech. The proposed method is based on statistical recovery of the spectral envelope parameters using a Gaussian Mixture Model (GMM), and a spectral shifting method is used for excitation extension.
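Spectral shifting of the excitation can be illustrated by modulating the signal with a cosine, which copies its spectrum up and down by the modulation frequency. This is a generic sketch of that idea only; the shift frequency, any subsequent filtering, and the function name are assumptions, and the GMM envelope estimation is not shown.

```python
import numpy as np

def shift_spectrum(signal, shift_hz, fs):
    """Modulate by a cosine so each component at f appears at f + shift and |f - shift|.

    The factor 2 keeps each shifted copy at the amplitude of the original component.
    """
    n = np.arange(len(signal))
    return signal * 2.0 * np.cos(2.0 * np.pi * shift_hz * n / fs)
```

For example, a 1 kHz tone sampled at 16 kHz and shifted by 4 kHz yields components at 3 kHz and 5 kHz. A practical system would then filter the result so that only the band missing from the narrowband signal is kept.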

Citations: 13

A novel multistage classification and Wavelet based kernel generation for handwritten Marathi compound character recognition

2011 International Conference on Communications and Signal Processing

Pub Date : 2011-03-24 DOI: 10.1109/ICCSP.2011.5739299

S. Shelke, S. Apte

This paper presents a novel approach for recognition of unconstrained handwritten Marathi compound characters. The recognition is carried out using a multistage feature extraction and classification scheme. The initial stages of feature extraction are based upon structural features, and the characters are classified according to their parameters. The final stage of feature extraction generates kernels using the Wavelet transform. A single level Wavelet decomposition is used to generate the approximation coefficients, which are stored as kernels for matching. A modified wavelet based kernel generation method is also implemented. In both cases the recognition is done by template matching. The results are analyzed using both kernel generation techniques for varying resize factors. The recognition rate achieved by the proposed method is 95.89% and 96.00% for 16×16 and 32×32 resize factors respectively with wavelet based kernels, and 96.41% and 97.94% for 16×16 and 32×32 resize factors respectively with modified wavelet based kernels.
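The final stage described above (single level decomposition, approximation coefficients stored as kernels, template matching) can be sketched as follows. The wavelet family is not named in the abstract, so the Haar wavelet is assumed here; the Euclidean distance used for matching, the function names, and the `kernels` dictionary layout are also assumptions.

```python
import numpy as np

def haar_approximation(image):
    """Single-level 2D Haar approximation (LL) coefficients.

    Each coefficient is the sum of a 2x2 block divided by 2 (orthonormal scaling).
    The image is assumed to have even height and width.
    """
    img = np.asarray(image, dtype=float)
    return (img[0::2, 0::2] + img[0::2, 1::2] +
            img[1::2, 0::2] + img[1::2, 1::2]) / 2.0

def match_template(sample, kernels):
    """Return the label whose stored kernel is closest to the sample's coefficients."""
    coeffs = haar_approximation(sample)
    return min(kernels, key=lambda label: np.linalg.norm(coeffs - kernels[label]))
```

A 16×16 resized character produces an 8×8 kernel and a 32×32 one produces a 16×16 kernel, so the resize factor directly sets how much detail the stored templates retain.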

Citations: 16

Downlink blind channel estimation for W-CDMA systems and performance analysis under various fading channel conditions

2011 International Conference on Communications and Signal Processing

Pub Date : 2011-03-24 DOI: 10.1109/ICCSP.2011.5739363

M. Sangeetha, V. Bhaskar

Wideband code division multiple access (W-CDMA) is a third generation (3G) mobile wireless technology that promises much higher data speeds for mobile and portable wireless devices than those commonly offered in today's market. In W-CDMA systems, while transmitting information over multipath channels, both intersymbol interference (ISI) as a result of interchip interference (ICI) and multiple access interference (MAI) cannot be easily eliminated. Although it is possible to design multiuser detectors that suppress MAI and ISI, these detectors often require explicit knowledge of at least the desired users' signature waveform. Torlak and Xu proposed a blind estimation algorithm for Asynchronous CDMA (A-CDMA) systems to estimate the multiple users' symbols [8]. In our work, we study a similar blind channel estimation scheme for downlink W-CDMA systems that provides estimates of the subchannels of multiple users by exploiting the structural information of the data output. In particular, we show that the subspace of the (data+noise) matrix contains sufficient information for unique determination of the channels, and hence of the signature waveforms and signal constellation. Performance measures such as bit error rate and root mean square error are plotted for various channel fading conditions.
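Subspace methods of this kind rest on one step: splitting the eigenvectors of the received data covariance into a signal subspace and a noise subspace, then using the fact that the users' signature waveforms are orthogonal to the noise subspace. The sketch below shows only that step; the paper's data model and the channel identification that follows are not reproduced, and the function name and matrix layout are assumptions.

```python
import numpy as np

def noise_subspace(data, n_signals):
    """Estimate an orthonormal basis of the noise subspace.

    data: (observation_length x n_snapshots) matrix of received vectors.
    n_signals: assumed dimension of the signal subspace.
    """
    cov = data @ data.conj().T / data.shape[1]
    # eigh returns eigenvalues in ascending order, so the smallest come first.
    _, eigvecs = np.linalg.eigh(cov)
    return eigvecs[:, : data.shape[0] - n_signals]
```

In the noise-free single-user case every received snapshot is a scaled copy of the signature, so the signature is orthogonal to every column of the returned basis; blind estimators exploit that orthogonality to solve for the unknown channel.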

Citations: 2

Implementation of speaker verification system using Fuzzy Wavelet Network

2011 International Conference on Communications and Signal Processing

Pub Date : 2011-03-24 DOI: 10.1109/ICCSP.2011.5739361

P. Shanmugapriya, Y. Venkataramani

In this paper, a Fuzzy Wavelet Network (FWN) is proposed to model the characteristics of a speaker in an automatic speaker verification system. A neural network that uses wavelets as activation functions is called a wavelet network (Wavenet). A Wavenet can extract the distinguishable and essential features of frequency rich signals, which is required in classification and identification problems such as speaker verification. The nonlinearity of a fuzzy inference system and its structured, human-like knowledge representation make it a suitable model for speaker verification when combined with the wavelet network. In this approach, wavelet theory is combined with fuzzy neural network theory to construct the FWN. The advantage of the fuzzy wavelet network is that the membership functions can be easily merged or divided using the multiresolution properties, and the rules can be evaluated during learning. The performance of the proposed speaker verification system is evaluated on the TIMIT database. The proposed system is compared with a system using the state of the art model (GMM). Compared with GMM and WNN, FWN provides better verification performance.

Citations: 6

Speaker independent continuous speech and isolated digit recognition using VQ and HMM

2011 International Conference on Communications and Signal Processing

Pub Date : 2011-03-24 DOI: 10.1109/ICCSP.2011.5739300

A. Revathi, Y. Venkataramani

The main objective of this paper is to explore the effectiveness of perceptual features for isolated digit and continuous speech recognition. The proposed perceptual features are captured and code book indices are extracted. The expectation maximization algorithm is used to generate HMM models for the speech utterances. The speech recognition system is evaluated on clean test speech, and the experimental results show the performance of the proposed algorithm in recognizing isolated digits and continuous speech based on the maximum log likelihood value between the test features and the HMM model for each utterance. The performance of these features is tested on utterances randomly chosen from the “TI Digits_1”, “TI Digits_2” and “TIMIT” databases. The algorithm is tested for VQ and for the combination of VQ and HMM speech modeling techniques. The perceptual linear predictive cepstrum yields accuracies of 86% and 93% for speaker independent isolated digit recognition using VQ and the combination of VQ and HMM speech models respectively. This feature also gives 99% and 100% accuracy for speaker independent continuous speech recognition using VQ and the combination of VQ and HMM speech modeling techniques.
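The two operations the abstract relies on, mapping feature vectors to code book indices and scoring the resulting index sequence against each HMM by log likelihood, can be sketched as below. This is a generic discrete-HMM illustration under assumed names and array layouts; the feature extraction, code book training, and EM training of the models are not shown, and nothing here is taken from the paper's implementation.

```python
import numpy as np

def codebook_indices(features, codebook):
    """Vector quantization: index of the nearest codeword for each feature vector.

    features: (n_frames x dim) array, codebook: (n_codewords x dim) array.
    """
    dists = np.linalg.norm(features[:, None, :] - codebook[None, :, :], axis=2)
    return dists.argmin(axis=1)

def log_likelihood(obs, start, trans, emit):
    """Scaled forward algorithm for a discrete HMM.

    obs: sequence of code book indices, start: (n_states,),
    trans: (n_states x n_states), emit: (n_states x n_symbols).
    """
    alpha = start * emit[:, obs[0]]
    log_prob = 0.0
    for symbol in obs[1:]:
        scale = alpha.sum()
        log_prob += np.log(scale)          # accumulate the scaling factors
        alpha = (alpha / scale) @ trans * emit[:, symbol]
    return log_prob + np.log(alpha.sum())
```

Recognition then amounts to computing `log_likelihood` of the test index sequence under every word or digit model and choosing the model with the largest value.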

Citations: 33
