Feature Data Reduction of MFCC Using PCA and SVD in Speech Recognition System

A. Winursito, Risanuri Hidayat, Agus Bejo, Muhammad Nur Yasir Utomo
2018 International Conference on Smart Computing and Electronic Enterprise (ICSCEE), published 2018-07-01
DOI: 10.1109/ICSCEE.2018.8538414 (https://doi.org/10.1109/ICSCEE.2018.8538414)
Citations: 18

Abstract

Pattern recognition systems have developed rapidly in this century, and many methods have been proposed. Mel Frequency Cepstral Coefficients (MFCC) is a popular feature extraction method, but it still has disadvantages, particularly its level of accuracy and the high dimensionality of the features it produces. This paper presents feature data reduction of MFCC using Principal Component Analysis (PCA) and Singular Value Decomposition (SVD). Combining MFCC with a data reduction method is expected to improve accuracy and speed up the classification process by decreasing the dimensionality of the feature data. The extracted MFCC features, together with the delta coefficients, form the data matrix to which the data reduction method is applied. The data reduction process is designed in two versions. The reduced data are then classified with a Support Vector Machine (SVM). The dataset consists of 140 speech recordings from 28 speakers. The results show that MFCC + PCA version 2 and MFCC + SVD version 1 provide the largest accuracy improvement, raising accuracy from 83.57% for the conventional MFCC method to 90.71%. In addition, both methods shorten the classification process from 7.819 seconds to about 7.6 seconds by reducing the feature dimension from 26 to 10 for MFCC + PCA version 2 and from 26 to 14 for MFCC + SVD version 1.
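
The reduction step described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not say how "version 1" and "version 2" differ, so neither is reproduced here. The function names `pca_reduce` and `svd_reduce`, the use of one 26-dimensional feature vector per recording, and the random data standing in for real MFCC + delta features are all assumptions made for illustration. Only the sizes (140 recordings, 26 features, 10 and 14 retained dimensions) come from the abstract.

```python
import numpy as np


def pca_reduce(X, k):
    """Project the rows of X onto the top-k principal components."""
    Xc = X - X.mean(axis=0)  # center each feature column
    # For centered data, the rows of Vt are the principal directions,
    # ordered by decreasing singular value (i.e. decreasing variance).
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T


def svd_reduce(X, k):
    """Project the rows of X onto the top-k right singular vectors (no centering)."""
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    return X @ Vt[:k].T


# Synthetic stand-in for the feature matrix (assumption): 140 recordings,
# each summarized as 26 values. This is NOT real MFCC + delta data.
rng = np.random.default_rng(0)
X = rng.normal(size=(140, 26))

X_pca = pca_reduce(X, 10)  # 26 -> 10, the size reported for MFCC + PCA version 2
X_svd = svd_reduce(X, 14)  # 26 -> 14, the size reported for MFCC + SVD version 1

print(X_pca.shape, X_svd.shape)  # (140, 10) (140, 14)
```

In a full pipeline the reduced matrices would be passed to an SVM classifier in place of the original 26-dimensional features. The only difference between the two functions in this sketch is whether the data are centered before the decomposition.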