Gaussian Mixture Model and Gaussian Supervector for Image Classification
Authors: Yuechi Jiang, F. H. F. Leung
Venue: 2018 IEEE 23rd International Conference on Digital Signal Processing (DSP), 2018-11-01
DOI: 10.1109/ICDSP.2018.8631558
The Gaussian Mixture Model (GMM) has been widely used in speech and image classification tasks. It can be used directly as a classifier, or as a representation of speech or image signals. Another important use of the GMM is to serve as the Universal Background Model (UBM) from which speech representations such as the Gaussian Supervector (GSV) and the i-vector are generated. In this paper, we borrow the GSV from speech classification studies and apply it as an image representation for image classification. Apart from employing the conventional GMM as the UBM to calculate the GSV, we also propose the Equal-Variance GMM (EV-GMM), in which all the variables in all the Gaussian mixture components share the same variance. Moreover, we derive a kernel version of EV-GMM, which generalizes EV-GMM by introducing a kernel. We then compare the GSV with the raw image feature and with other popular image representations, namely Sparse Representation (SR) and Collaborative Representation (CR). Experiments on a handwritten digit recognition task indicate that the GSV works very well and can outperform these other image representations. In addition, as the UBM, the proposed EV-GMM can work better than the conventional GMM.
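
The abstract does not give the formulas, so the following is only a minimal sketch of how a GSV might be computed from an equal-variance UBM. It assumes the common speech-processing recipe (relevance MAP adaptation of the UBM means, then stacking the adapted means), which may differ from the authors' exact procedure. The function names, the `relevance` factor, and the toy UBM parameters are all hypothetical, and the image is assumed to be given as a set of feature vectors `X`.

```python
import numpy as np

def ev_gmm_posteriors(X, weights, means, var):
    """Component posteriors under a GMM whose components share one scalar variance."""
    # Squared Euclidean distance from every sample to every mean: shape (T, K)
    sq_dist = ((X[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)
    # With a shared variance the Gaussian normalising constant is the same
    # for every component, so it cancels in the posterior.
    log_p = np.log(weights)[None, :] - sq_dist / (2.0 * var)
    log_p -= log_p.max(axis=1, keepdims=True)  # numerical stability
    p = np.exp(log_p)
    return p / p.sum(axis=1, keepdims=True)

def gaussian_supervector(X, weights, means, var, relevance=16.0):
    """Stack MAP-adapted means into one vector (assumed recipe, not the paper's)."""
    gamma = ev_gmm_posteriors(X, weights, means, var)    # (T, K)
    n_k = gamma.sum(axis=0)                              # soft counts, (K,)
    data_mean = (gamma.T @ X) / np.maximum(n_k, 1e-12)[:, None]  # (K, D)
    alpha = (n_k / (n_k + relevance))[:, None]           # adaptation weights
    adapted = alpha * data_mean + (1.0 - alpha) * means  # (K, D)
    return adapted.reshape(-1)                           # length K * D

# Toy UBM: two components in 2-D, equal weights, shared variance 1.0 (hypothetical values)
ubm_weights = np.array([0.5, 0.5])
ubm_means = np.array([[0.0, 0.0], [10.0, 10.0]])
X = np.array([[0.0, 0.0], [1.0, 1.0]])  # feature vectors of one "image"
gsv = gaussian_supervector(X, ubm_weights, ubm_means, var=1.0)
```

In this toy case both samples lie near the first component, so only the first mean moves toward the data; the second block of the supervector stays at the UBM mean. The resulting fixed-length vector could then be passed to any standard classifier.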