Choosing an accurate number of mel frequency cepstral coefficients for audio classification purpose

Proceedings of the 10th International Symposium on Image and Signal Processing and Analysis, Pub Date: 2017-09-01, DOI: 10.1109/ISPA.2017.8073600

L. Grama, C. Rusu

Citations: 8

Abstract

In this paper, we study several audio classification schemes applied to different numbers of features for multiclass classification with imbalanced datasets. As features, we propose liftered Mel-frequency cepstral coefficients (MFCCs), while for classification we use probabilistic methods, instance-based learning algorithms, support vector machines, neural networks, an L∞-norm-based classifier, a fuzzy lattice reasoning classifier, and tree-based classifiers. The final goal is to find the number of liftered MFCCs that provides the desired accuracy for audio classification. The best results are obtained using 16 features with k-Nearest Neighbor as the classifier. In this case, the correct classification rate is 99.79%, the false alarm rate is 0.05%, the miss rate is 0.21%, the precision is 99.80%, and the F-measure is 99.79%.
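
The abstract does not say how the per-class measures are combined into single figures for a multiclass problem. The sketch below assumes support-weighted averaging over classes, which is one common convention and is consistent with the reported miss rate (0.21%) and correct classification rate (99.79%) summing to 100%. The function name `weighted_metrics` and the example confusion matrix are hypothetical and are not taken from the paper.

```python
def weighted_metrics(confusion):
    """Support-weighted multiclass metrics from a confusion matrix.

    confusion[i][j] = number of class-i samples predicted as class j.
    Assumption: per-class values are averaged with weights proportional
    to class size; the paper may aggregate differently.
    """
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)
    result = {
        "ccr": 0.0,               # correct classification rate
        "miss_rate": 0.0,         # false negative rate
        "false_alarm_rate": 0.0,  # false positive rate
        "precision": 0.0,
        "f_measure": 0.0,
    }
    for k in range(n_classes):
        support = sum(confusion[k])
        tp = confusion[k][k]
        fn = support - tp
        fp = sum(confusion[i][k] for i in range(n_classes)) - tp
        tn = total - tp - fn - fp

        recall = tp / support if support else 0.0
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        false_alarm = fp / (fp + tn) if (fp + tn) else 0.0
        denom = precision + recall
        f_measure = 2 * precision * recall / denom if denom else 0.0

        weight = support / total
        result["ccr"] += weight * recall
        result["miss_rate"] += weight * (1.0 - recall)
        result["false_alarm_rate"] += weight * false_alarm
        result["precision"] += weight * precision
        result["f_measure"] += weight * f_measure
    return result


# Hypothetical two-class example (not data from the paper).
example = [[8, 2],
           [1, 9]]
metrics = weighted_metrics(example)
for name, value in metrics.items():
    print(f"{name}: {100 * value:.2f}%")
```

Under this weighting, the averaged recall equals the overall accuracy, so the correct classification rate and the miss rate always sum to 100%. With imbalanced classes, an unweighted (macro) average would generally give different values.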


