Sound classification based on spectrogram for surveillance applications
Yingjie Li, Gang Liu
2016 IEEE International Conference on Network Infrastructure and Digital Content (IC-NIDC), 2016-09-01
DOI: 10.1109/ICNIDC.2016.7974583 (https://doi.org/10.1109/ICNIDC.2016.7974583)
This paper presents an audio event classification algorithm that automatically classifies an audio event as a footstep, glass breaking, a gunshot, or a scream, mainly for surveillance applications. First, Gabor features are extracted from the audio spectrogram; two kinds are used, a global Gabor feature and a local Gabor feature. Next, Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA) reduce the feature dimension. Finally, a K-nearest-neighbor (KNN) classifier recognizes the audio events. We carried out extensive experiments on clean and noisy audio sets. The results show that the algorithm achieves a recall of 96.1% on the clean sets and is more effective than traditional methods.
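The processing chain named in the abstract (spectrogram, Gabor features, PCA, LDA, KNN) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not give filter parameters, the definition of the global and local features, or the data, so everything below is an assumption made for illustration. In particular, the filter bank (two wavelengths, four orientations), the use of per-filter mean and standard deviation as a whole-spectrogram descriptor, the two synthetic classes (steady tone versus short noise burst), and all function names are hypothetical.

```python
import numpy as np

def spectrogram(signal, n_fft=256, hop=128):
    """Log-magnitude STFT spectrogram, shape (n_fft // 2 + 1, n_frames)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack([signal[i * hop:i * hop + n_fft] * window
                       for i in range(n_frames)])
    return np.log1p(np.abs(np.fft.rfft(frames, axis=1))).T

def gabor_kernel(size, wavelength, theta, sigma):
    """Real part of a 2-D Gabor filter: Gaussian envelope times cosine carrier."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    x_rot = x * np.cos(theta) + y * np.sin(theta)
    envelope = np.exp(-(x ** 2 + y ** 2) / (2.0 * sigma ** 2))
    return envelope * np.cos(2.0 * np.pi * x_rot / wavelength)

def gabor_features(spec, kernels):
    """Assumed descriptor: mean |response| and response std for each filter."""
    feats = []
    for k in kernels:
        # Zero-padded FFT product gives the full linear 2-D convolution.
        shape = (spec.shape[0] + k.shape[0] - 1, spec.shape[1] + k.shape[1] - 1)
        resp = np.fft.irfft2(np.fft.rfft2(spec, shape) * np.fft.rfft2(k, shape), shape)
        feats += [np.abs(resp).mean(), resp.std()]
    return np.array(feats)

def fit_pca(X, n_components):
    """Return the training mean and the top principal directions (columns)."""
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components].T

def fit_lda(X, y, n_components):
    """Fisher LDA: leading eigenvectors of pinv(S_within) @ S_between."""
    mean, d = X.mean(axis=0), X.shape[1]
    s_w, s_b = np.zeros((d, d)), np.zeros((d, d))
    for c in np.unique(y):
        xc = X[y == c]
        mc = xc.mean(axis=0)
        s_w += (xc - mc).T @ (xc - mc)
        s_b += len(xc) * np.outer(mc - mean, mc - mean)
    vals, vecs = np.linalg.eig(np.linalg.pinv(s_w) @ s_b)
    order = np.argsort(-vals.real)[:n_components]
    return vecs[:, order].real

def knn_predict(train_x, train_y, x, k=3):
    """Majority vote among the k nearest training points (Euclidean distance)."""
    dists = np.linalg.norm(x[:, None, :] - train_x[None, :, :], axis=2)
    nearest = np.argsort(dists, axis=1)[:, :k]
    return np.array([np.bincount(train_y[row]).argmax() for row in nearest])

def make_signal(label, rng, n=4096, sr=8000):
    """Synthetic stand-in data: label 0 is a steady tone, label 1 a short burst."""
    x = 0.05 * rng.standard_normal(n)
    if label == 0:
        x += np.sin(2.0 * np.pi * rng.uniform(300.0, 3000.0) * np.arange(n) / sr)
    else:
        start = rng.integers(256, n - 512)
        x[start:start + 64] += rng.standard_normal(64)
    return x

def run_demo(seed=0, n_per_class=30, n_train=40):
    rng = np.random.default_rng(seed)
    kernels = [gabor_kernel(9, w, t, 3.0)
               for w in (4.0, 8.0) for t in np.arange(4) * np.pi / 4]
    y = np.repeat([0, 1], n_per_class)
    X = np.stack([gabor_features(spectrogram(make_signal(c, rng)), kernels) for c in y])
    idx = rng.permutation(len(y))
    train, test = idx[:n_train], idx[n_train:]
    mean, P = fit_pca(X[train], 3)               # PCA: 16 features down to 3
    z_train, z_test = (X[train] - mean) @ P, (X[test] - mean) @ P
    W = fit_lda(z_train, y[train], 1)            # LDA: at most (classes - 1) axes
    pred = knn_predict(z_train @ W, y[train], z_test @ W, k=3)
    return float((pred == y[test]).mean())

accuracy = run_demo()
print(f"held-out accuracy on synthetic data: {accuracy:.2f}")
```

The printed accuracy refers only to the synthetic two-class toy data and is not comparable to the 96.1% recall reported in the abstract. Applying PCA before LDA, as the abstract describes, keeps the within-class scatter matrix well conditioned when the raw feature dimension is large relative to the number of training samples.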