Convolution Neural Networks of Dynamically Sized Filters with Modified Stochastic Gradient Descent Optimizer for Sound Classification

Manu Pratap Singh, Pratibha Rashmi
DOI: 10.3844/jcssp.2024.69.87
Journal: Journal of Computer Science
Published: 2024-01-01 (Journal Article)
Citations: 0

Abstract

Deep Neural Networks (DNNs), specifically Convolutional Neural Networks (CNNs), are well suited to the problem of sound classification because of their ability to capture patterns in both the time and frequency domains. Convolutional neural networks are mostly trained and tested on time-frequency patches of sound samples in the form of 2D pattern vectors. Existing pre-trained convolutional neural network models generally use fixed-size filters in all convolution layers. In the present work, we consider three different convolutional neural network architectures with variable-size filters. The training-set pattern vectors in the time and frequency dimensions are constructed from spectrogram input samples. In our proposed architectures, both the size and the number of kernels vary in scale, instead of using fixed-size filters and a static number of channels. The paper further presents a reformulation of the minibatch stochastic gradient descent optimizer with adaptive learning-rate parameters tailored to the proposed architectures. Experimental results are obtained on an existing dataset of sound samples. The simulation results show that the proposed convolutional neural network architectures outperform existing pre-trained networks on the same dataset.
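The abstract names a minibatch stochastic gradient descent optimizer with adaptive learning-rate parameters but does not give its exact reformulation. As a hedged illustration only (not the paper's actual method), the following sketch shows one common variant: a minibatch SGD loop whose step size decays with the iteration count. The names `eta0` and `decay`, and the toy mean-estimation objective, are assumptions for illustration.

```python
import numpy as np

def minibatch_sgd(grad_fn, w, data, batch_size=32, eta0=0.1,
                  decay=1e-3, epochs=5, seed=0):
    """Minibatch SGD with a step-decaying learning rate (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    t = 0  # global step counter across all minibatches
    for _ in range(epochs):
        idx = rng.permutation(len(data))  # reshuffle each epoch
        for start in range(0, len(data), batch_size):
            batch = data[idx[start:start + batch_size]]
            eta_t = eta0 / (1.0 + decay * t)  # adaptive (decaying) step size
            w = w - eta_t * grad_fn(w, batch)
            t += 1
    return w

# Toy problem: estimate the mean of noisy samples by minimizing
# the average of 0.5 * (w - x)^2 over each minibatch.
data = np.random.default_rng(1).normal(loc=3.0, scale=0.5, size=1000)
grad = lambda w, batch: np.mean(w - batch)  # gradient of the minibatch loss
w_final = minibatch_sgd(grad, w=0.0, data=data, epochs=10)
```

On this toy objective the iterate settles near the data mean (about 3.0); in the paper's setting the same loop structure would instead update the convolutional kernel weights from spectrogram minibatches.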
Source journal: Journal of Computer Science (Computer Science - Computer Networks and Communications)
CiteScore: 1.70
Self-citation rate: 0.00%
Articles per year: 92
About the journal: Journal of Computer Science aims to publish research articles on the theoretical foundations of information and computation, and on practical techniques for their implementation and application in computer systems. JCS is published twelve times a year and is a peer-reviewed journal covering the latest and most compelling research of the time.