Beyond amplitude: Phase integration in bird vocalization recognition with MHAResNet
Authors: Jiangjian Xie, Zhulin Hao, Chunhe Hu, Changchun Zhang, Junguo Zhang
Journal: Avian Research, volume 16, issue 1, Article 100229 (journal article)
Published: 2025-02-15
DOI: 10.1016/j.avrs.2025.100229
URL: https://www.sciencedirect.com/science/article/pii/S2053716625000088
Impact factor: 1.6; JCR: Q1 (Ornithology)
Citations: 0
Abstract
Bird vocalizations are pivotal for ecological monitoring, providing insights into biodiversity and ecosystem health. Traditional recognition methods often neglect phase information, resulting in incomplete feature representation. In this paper, we introduce a novel approach to bird vocalization recognition (BVR) that integrates both amplitude and phase information, leading to enhanced species identification. We propose MHAResNet, a deep learning (DL) model that employs residual blocks and a multi-head attention mechanism to capture salient features from logarithmic power (POW), instantaneous frequency (IF), and group delay (GD) extracted from bird vocalizations. Experiments on three bird vocalization datasets demonstrate our method's superior performance, achieving accuracy rates of 94%, 98.9%, and 87.1%, respectively. These results indicate that our approach provides a more effective representation of bird vocalizations, outperforming existing methods. This integration of phase information in BVR is innovative and significantly advances the field of automatic bird monitoring technology, offering valuable tools for ecological research and conservation efforts.
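The abstract names three input features (POW, IF, GD) but does not give their exact definitions or the analysis parameters. The following is a minimal sketch under common textbook definitions, not the authors' implementation: POW is taken as the log of the squared STFT magnitude, IF as the wrapped phase difference between consecutive frames, and GD as the negative wrapped phase difference between adjacent frequency bins. The function names, `n_fft`, `hop`, and `eps` are illustrative assumptions.

```python
import numpy as np

def stft(signal, n_fft=512, hop=128):
    """Hann-windowed STFT; returns a complex array of shape (freq_bins, frames)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + n_fft] * window for i in range(n_frames)],
        axis=1,
    )
    return np.fft.rfft(frames, axis=0)

def wrap(phase_diff):
    """Map phase differences back into the principal interval around zero."""
    return np.angle(np.exp(1j * phase_diff))

def amplitude_phase_features(signal, n_fft=512, hop=128, eps=1e-10):
    """Stack POW, IF, and GD into an array of shape (3, freq_bins, frames).

    Assumed definitions (not taken from the paper):
      POW = log(|X|^2 + eps)
      IF  = wrapped phase difference along the time (frame) axis
      GD  = negative wrapped phase difference along the frequency axis
    """
    spec = stft(signal, n_fft, hop)
    phase = np.angle(spec)
    log_power = np.log(np.abs(spec) ** 2 + eps)
    # prepend the first column/row so the output keeps the input shape
    inst_freq = wrap(np.diff(phase, axis=1, prepend=phase[:, :1]))
    group_delay = -wrap(np.diff(phase, axis=0, prepend=phase[:1, :]))
    return np.stack([log_power, inst_freq, group_delay], axis=0)

# Usage: a 1-second, 1 kHz test tone sampled at 16 kHz (synthetic, not bird audio)
sr = 16000
t = np.arange(sr) / sr
features = amplitude_phase_features(np.sin(2 * np.pi * 1000 * t))
print(features.shape)
```

Stacking the three maps as channels gives an image-like tensor that a convolutional network with residual blocks could consume. How MHAResNet actually combines the channels is not specified in the abstract.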
About the journal:
Avian Research is an open access, peer-reviewed journal publishing high-quality research and review articles on all aspects of ornithology from all over the world. It aims to report the latest and most significant progress in ornithology and to encourage the exchange of ideas among international ornithologists. As an open access journal, Avian Research provides a unique opportunity to publish high-quality content that is internationally accessible to any reader at no cost.