Benjamin Benti, Patrick Jo Miller, Heike Vester, Florencia Noriega, Charlotte Cure
bioRxiv - Animal Behavior and Cognition, 2024-09-15, https://doi.org/10.1101/2024.09.13.612808
Unsupervised classification of graded animal vocalisations using fuzzy clustering
We present an unsupervised procedure for classifying graded animal vocalisations, based on Mel frequency cepstral coefficients and fuzzy clustering. Cepstral coefficients compress information about the distribution of energy across the frequency spectrum into a reduced number of variables and are well defined for signals of various acoustic characteristics (tonal, pulsed, or broadband). In addition, the Mel scale mimics the logarithmic perception of pitch by mammalian ears and is therefore well suited to defining meaningful perceptual categories for mammals. Fuzzy clustering is a soft classification approach: it does not assign samples to a single category, but rather describes their position relative to overlapping categories. This method can identify, in a quantitative way, both stereotyped vocalisations (those located in a single category) and graded vocalisations (those that lie between categories). We evaluated the performance of this procedure on a set of long-finned pilot whale (Globicephala melas) calls and compared our results with a call catalogue previously defined through audio-visual inspection of the calls by human experts. Our unsupervised classification achieved slightly lower precision than the catalogue approach: we described between two and ten fuzzy clusters, compared to 11 call types in the catalogue. The fuzzy clustering did not replicate the manual classification. One-to-one correspondences between fuzzy clusters and catalogue call types were rare; however, the same sets of call types were consistently grouped together within fuzzy clusters. There were also discrepancies between the two classification approaches, with some catalogue call types being consistently spread over several fuzzy clusters. Compared to manual classification, the fuzzy clustering approach proved to be much less time-consuming (days vs. months) and provided additional quantitative information about the graded nature of the vocalisations. We discuss the scope of our unsupervised classifier and the need to investigate the functions of call gradation in future research.
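To make the idea of soft membership concrete, here is a minimal sketch of fuzzy clustering in NumPy. It is not the authors' implementation: the abstract does not name a specific algorithm, so the sketch assumes standard fuzzy c-means with a fuzzifier of m = 2. The function names, the 0.8 membership threshold, and the toy 2-D data (standing in for per-call cepstral feature vectors) are all hypothetical choices made for illustration.

```python
import numpy as np


def fuzzy_c_means(X, n_clusters, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Standard fuzzy c-means (assumed here; the abstract names no algorithm).

    X is an (n_samples, n_features) array, e.g. one feature row per call.
    Returns (centers, U), where U[i, k] is the membership of sample i in
    cluster k and every row of U sums to 1.
    """
    rng = np.random.default_rng(seed)
    # Random initial memberships, normalised so each row sums to 1.
    U = rng.random((X.shape[0], n_clusters))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        W = U ** m
        # Cluster centres are membership-weighted means of the samples.
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        # Euclidean distance from every sample to every centre.
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        dist = np.maximum(dist, 1e-12)  # guard against division by zero
        # Membership update: u_ik is proportional to d_ik ** (-2 / (m - 1)).
        inv = dist ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centers, U


def is_stereotyped(U, threshold=0.8):
    """Flag samples whose largest membership reaches the threshold.

    The threshold is a hypothetical cut-off: samples below it are treated
    as graded, i.e. lying between clusters.
    """
    return U.max(axis=1) >= threshold


# Toy data: two tight groups plus one sample halfway between them.
rng = np.random.default_rng(1)
group_a = rng.normal([0.0, 0.0], 0.3, size=(20, 2))
group_b = rng.normal([10.0, 0.0], 0.3, size=(20, 2))
in_between = np.array([[5.0, 0.0]])
X = np.vstack([group_a, group_b, in_between])

centers, U = fuzzy_c_means(X, n_clusters=2)
stereotyped = is_stereotyped(U)
print("memberships of the in-between sample:", np.round(U[-1], 2))
print("number of samples flagged as graded:", int((~stereotyped).sum()))
```

In this toy setting, samples inside a tight group receive a membership close to 1 for one cluster, while the in-between sample splits its membership across both. That split is the quantitative information about gradation that a hard classifier would discard.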