Robust Forest Sound Classification Using Pareto-Mordukhovich Optimized MFCC in Environmental Monitoring

IEEE Access (IF 3.6; CAS Zone 3, Computer Science; JCR Q2, Computer Science, Information Systems). Pub Date: 2025-01-28. DOI: 10.1109/ACCESS.2025.3535796
Ahmad Qurthobi;Robertas Damaševičius;Vytautas Barzdaitis;Rytis Maskeliūnas
IEEE Access, vol. 13, pp. 20923-20944. Full text: https://ieeexplore.ieee.org/document/10856116/
Citations: 0

Abstract

As a complex ecosystem composed of flora and fauna, the forest has always been vulnerable to threats. Previous researchers used environmental audio collections, such as the ESC-50 and UrbanSound8K datasets, as proxies for the sounds likely to be present in forests. This study applies deep learning models to forest sound classification as a step toward an early threat detection system. The research evaluates the performance of several pre-trained deep learning models, including MobileNet, GoogleNet, and ResNet, on the limited FSC22 dataset, which consists of 2,025 forest sound recordings classified into 27 categories. To improve classification capabilities, the study introduces a hybrid model that combines a convolutional neural network (CNN) with a Bidirectional Long Short-Term Memory (BiLSTM) layer, designed to capture both spatial and temporal features of the sound data. The research also employs Pareto-Mordukhovich-optimized Mel Frequency Cepstral Coefficients (MFCC) for feature extraction, improving the representation of audio signals. Data augmentation and dimensionality reduction techniques were also explored to assess their impact on model performance. The results indicate that the proposed hybrid CNN-BiLSTM model significantly improved classification loss and accuracy scores compared to the standalone pre-trained models. GoogleNet, with an added BiLSTM layer and augmented data, achieved a reduced average loss of 0.7209 and an average accuracy of 0.7852, demonstrating its potential to classify forest sounds. These improvements in loss and classification performance highlight the potential of hybrid models in environmental sound analysis, particularly in scenarios with limited data availability.
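For readers unfamiliar with the feature the abstract builds on, the following is a minimal sketch of a plain MFCC computation for a single audio frame, using only the Python standard library. It does not implement the Pareto-Mordukhovich optimization, which the abstract names but does not describe. The function names, the 16 kHz sample rate, the 256-sample frame, the 20 mel filters, and the 13 coefficients are all illustrative assumptions rather than the authors' settings.

```python
import cmath
import math


def hz_to_mel(hz):
    """Convert frequency in Hz to the mel scale (common 2595*log10 form)."""
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def power_spectrum(frame):
    """Hamming-window one frame and return its one-sided power spectrum.

    A naive O(n^2) DFT keeps the sketch dependency-free; real code would use an FFT.
    """
    n = len(frame)
    windowed = [x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(windowed))
        spectrum.append(abs(acc) ** 2 / n)
    return spectrum


def mel_filterbank(n_filters, n_fft, sample_rate):
    """Build triangular filters whose centres are evenly spaced on the mel scale."""
    top = hz_to_mel(sample_rate / 2)
    mel_points = [top * i / (n_filters + 1) for i in range(n_filters + 2)]
    bins = [int(math.floor((n_fft + 1) * mel_to_hz(m) / sample_rate))
            for m in mel_points]
    n_bins = n_fft // 2 + 1
    bank = []
    for j in range(1, n_filters + 1):
        left, centre, right = bins[j - 1], bins[j], bins[j + 1]
        row = [0.0] * n_bins
        for k in range(left, centre):      # rising edge of the triangle
            row[k] = (k - left) / (centre - left)
        for k in range(centre, right):     # falling edge of the triangle
            row[k] = (right - k) / (right - centre)
        bank.append(row)
    return bank


def mfcc_frame(frame, sample_rate, n_filters=20, n_coeffs=13):
    """Return the first n_coeffs MFCCs of one frame (unnormalized DCT-II)."""
    power = power_spectrum(frame)
    bank = mel_filterbank(n_filters, len(frame), sample_rate)
    # Log of each filter's energy; the floor avoids log(0) for empty filters.
    log_energies = [math.log(max(sum(w * p for w, p in zip(row, power)), 1e-10))
                    for row in bank]
    return [sum(e * math.cos(math.pi * c * (m + 0.5) / n_filters)
                for m, e in enumerate(log_energies))
            for c in range(n_coeffs)]


# Usage: a synthetic 440 Hz tone stands in for a forest recording frame.
sample_rate = 16000
frame = [math.sin(2 * math.pi * 440 * t / sample_rate) for t in range(256)]
coeffs = mfcc_frame(frame, sample_rate)
print(len(coeffs))
```

In a pipeline like the one the abstract describes, such per-frame vectors would typically be stacked over time into a 2-D array, which a CNN can treat as an image and a BiLSTM can read as a sequence. How the authors actually arrange and optimize the features is not stated in the abstract.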
Source journal
IEEE Access (Computer Science, Information Systems; Engineering, Electrical & Electronic)
CiteScore: 9.80
Self-citation rate: 7.70%
Articles published: 6673
Review time: 6 weeks
About the journal: IEEE Access® is a multidisciplinary, open access (OA), applications-oriented, all-electronic archival journal that continuously presents the results of original research or development across all of IEEE's fields of interest. IEEE Access will publish articles that are of high interest to readers, original, technically correct, and clearly presented. Supported by author publication charges (APC), its hallmarks are a rapid peer review and publication process with open access to all readers. Unlike IEEE's traditional Transactions or Journals, reviews are "binary", in that reviewers will either Accept or Reject an article in the form it is submitted in order to achieve rapid turnaround. Especially encouraged are submissions on: Multidisciplinary topics, or applications-oriented articles and negative results that do not fit within the scope of IEEE's traditional journals. Practical articles discussing new experiments or measurement techniques, interesting solutions to engineering. Development of new or improved fabrication or manufacturing techniques. Reviews or survey articles of new or evolving fields oriented to assist others in understanding the new area.