Statistical discrimination and identification of some acoustic sounds

H. Sayoud, S. Ouamour
2006 IEEE GCC Conference (GCC), published 2006-03-20. DOI: 10.1109/IEEEGCC.2006.5686246 (https://doi.org/10.1109/IEEEGCC.2006.5686246)
Citations: 1

Abstract

Most speech recordings are mixed with other sounds such as music, songs, or noise, and any speech processing task becomes easier once the speech regions are separated from the non-speech regions. We therefore propose a preprocessing method for speech/non-speech discrimination that can also identify some acoustic sounds. It relies on statistical observations (mean and standard deviation) of a statistical similarity measure, μGc. Speakers can be discriminated thanks to the small within-speaker variability and the large between-speaker variability of their acoustic features, so we extend this property to the discrimination of acoustic sounds. We investigated several types of sound: noise, music, and speech (speech signals are taken from the TIMIT database). The aim is to define a separate class for each type of sound according to μGc. Experiments showed that the range of the similarity distance between speech and each other acoustic signal has a mean and a standard deviation that are specific to that sound. It is therefore possible to decide whether a given audio signal is speech or non-speech solely by observing the statistical range of μGc, which serves as the similarity distance. For example, the value of μGc indicates whether an audio frame is pure speech or music: if μGc lies within [2.5, 4.9], the sound should be music.
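The range test described in the abstract can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes μGc has already been computed for each frame against a speech reference (the abstract does not give its formula), it treats the quoted music interval as inclusive, and the names `MUSIC_RANGE`, `in_range`, `class_range`, `label_frame` and the width factor `k` are hypothetical.

```python
from statistics import mean, stdev

# Interval quoted in the abstract for music; inclusive bounds are an assumption.
MUSIC_RANGE = (2.5, 4.9)


def in_range(mu_gc, bounds):
    """Return True when a muGc value falls inside the closed interval `bounds`."""
    low, high = bounds
    return low <= mu_gc <= high


def class_range(mu_gc_samples, k=1.0):
    """Hypothetical helper: build the range mean +/- k*std from measured muGc values.

    `k` is an illustrative width factor, not a value taken from the paper.
    """
    m = mean(mu_gc_samples)
    s = stdev(mu_gc_samples)  # sample standard deviation, needs at least 2 values
    return (m - k * s, m + k * s)


def label_frame(mu_gc):
    """Label one frame using only the music interval quoted in the abstract."""
    return "music" if in_range(mu_gc, MUSIC_RANGE) else "not music"
```

Only the music interval comes from the abstract, so `label_frame` reports "not music" for any other value and does not guess between speech and noise. Ranges for further classes could be built with `class_range` from measured μGc values for each sound type.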