Baby Crying Sound Classification using Convolutional Neural Network

Naufal Fikri Muhammad, Raimi Dewan, Jaysuman Pusppanathan, Faishal Adilah Suryanata
Journal of Human Centered Technology. Published 2024-02-06. DOI: 10.11113/humentech.v3n1.66 (https://doi.org/10.11113/humentech.v3n1.66)

Abstract

Crying is a newborn's earliest and most important means of communication. Without appropriate training or expertise, such as that of nurses, paediatricians, and childcare professionals, many people cannot recognise what a baby's cry is meant to convey, and accurate interpretation is challenging. This study classifies baby crying sounds using a Convolutional Neural Network (CNN). The dataset comprises 3,495 one-second audio clips in five classes: belly pain, burping, discomfort, hungry, and tired. The methodology involves preprocessing the audio data, extracting Mel-Frequency Cepstral Coefficients (MFCC) as features, and training the CNN model. To determine the better architecture, two configurations of the CNN model are evaluated. Their settings are identical except for the layer sizes: the first configuration uses 100, 200, and 100 neurons in its respective layers, while the second uses 256, 512, and 256. The second configuration, with its larger and more complex layers, achieves higher accuracy (86%) than the first (84%). The study demonstrates the effectiveness of CNNs in classifying baby cries and highlights the importance of model architecture in achieving accurate classification results. Future research could explore larger and more diverse datasets to improve generalizability.
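The MFCC feature-extraction step mentioned in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract gives no sample rate, frame length, hop length, filter count, or number of coefficients, so the values below (16 kHz, 25 ms frames, 10 ms hop, 40 mel filters, 13 coefficients) are assumptions chosen only because they are common defaults, and the function names are hypothetical.

```python
import numpy as np


def hz_to_mel(hz):
    """Convert frequency in Hz to the mel scale (HTK-style formula)."""
    return 2595.0 * np.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_filterbank(n_mels, n_fft, sample_rate):
    """Triangular filters whose edges are equally spaced on the mel scale."""
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), n_mels + 2)
    hz_points = mel_to_hz(mel_points)
    fft_freqs = np.linspace(0.0, sample_rate / 2.0, n_fft // 2 + 1)
    bank = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(n_mels):
        left, centre, right = hz_points[i], hz_points[i + 1], hz_points[i + 2]
        rising = (fft_freqs - left) / (centre - left)
        falling = (right - fft_freqs) / (right - centre)
        bank[i] = np.maximum(0.0, np.minimum(rising, falling))
    return bank


def mfcc(signal, sample_rate=16000, n_mfcc=13, n_mels=40,
         frame_len=400, hop_len=160, n_fft=512):
    """Return an (n_frames, n_mfcc) MFCC matrix. All defaults are assumptions."""
    signal = np.asarray(signal, dtype=np.float64)
    if len(signal) < frame_len:
        # Zero-pad very short clips so at least one frame exists.
        signal = np.pad(signal, (0, frame_len - len(signal)))
    # Slice the clip into overlapping frames and apply a Hamming window.
    n_frames = 1 + (len(signal) - frame_len) // hop_len
    idx = np.arange(frame_len)[None, :] + hop_len * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hamming(frame_len)
    # Power spectrum of each frame.
    power = np.abs(np.fft.rfft(frames, n=n_fft, axis=1)) ** 2 / n_fft
    # Mel-band energies, log-compressed (small offset avoids log(0)).
    log_mel = np.log(power @ mel_filterbank(n_mels, n_fft, sample_rate).T + 1e-10)
    # Unnormalised DCT-II over the mel axis; keep the first n_mfcc coefficients.
    k = np.arange(n_mfcc)[:, None]
    n = np.arange(n_mels)[None, :]
    dct_basis = np.cos(np.pi * k * (2 * n + 1) / (2 * n_mels))
    return log_mel @ dct_basis.T


if __name__ == "__main__":
    # Synthetic one-second tone standing in for a cry clip (not real data).
    sr = 16000
    t = np.arange(sr) / sr
    clip = np.sin(2 * np.pi * 440.0 * t)
    print(mfcc(clip, sample_rate=sr).shape)
```

Under these assumed settings, a one-second clip yields a 98 x 13 matrix (frames by coefficients), which is the kind of two-dimensional input a CNN can consume. In practice an audio library would normally be used for this step rather than a hand-written routine.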