Phonetic content impact on Forensic Voice Comparison

Ajili Moez, Bonastre Jean-François, Ben Kheder Waad, Rossato Solange, Kahn Juliette
DOI: 10.1109/SLT.2016.7846267 (https://doi.org/10.1109/SLT.2016.7846267)
Published in: 2016 IEEE Spoken Language Technology Workshop (SLT)
Publication date: 2016-12-01
Citations: 25

Abstract

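Forensic Voice Comparison (FVC) increasingly uses the likelihood ratio (LR) to indicate whether the evidence supports the prosecution (same-speaker) or the defence (different-speakers) hypothesis. Beyond supporting one hypothesis, the LR provides a theoretically founded estimate of the relative strength of that support. Despite this theoretical appeal, the LR has practical limitations, due both to its estimation process and to a lack of knowledge about the reliability of that (practical) estimation process. In many situations, a lack of reliability at the estimation level potentially destroys the reliability of the resulting LR. This is particularly true for automatic FVC, because Automatic Speaker Recognition (ASpR) systems output a score in all situations, regardless of case-specific conditions. Furthermore, ASpR systems apply various normalization steps so that their scores can be treated as LRs, and these steps are potential sources of bias. LR estimation by ASpR systems also leaves several factors out of account: the amount of information involved in the comparison, the phonemic content, and the speaker's intrinsic characteristics, denoted here the "speaker factor". Consequently, a more complete view of reliability appears mandatory for FVC, even when an LR-like approach is used. This article focuses on the impact of phonemic content on FVC performance and variability. The experiments use the FABIOLE database, which is dedicated to this kind of study and makes it possible to examine both inter-speaker and intra-speaker variability. The results demonstrate the importance of phonemic content and highlight interesting differences between inter-speaker and intra-speaker effects.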
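As a rough illustration of the likelihood ratio the abstract refers to, the sketch below computes LR = p(score | same speaker) / p(score | different speakers) from a single comparison score. Everything in it is an assumption made for illustration and is not taken from the paper: the Gaussian score models, the parameter values in `SAME` and `DIFF`, and the function names are hypothetical.

```python
import math


def gaussian_pdf(x: float, mean: float, std: float) -> float:
    """Density of a normal distribution N(mean, std^2) evaluated at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def likelihood_ratio(score: float,
                     same_mean: float, same_std: float,
                     diff_mean: float, diff_std: float) -> float:
    """LR = p(score | same speaker) / p(score | different speakers).

    Both score distributions are modelled as Gaussians here, which is
    an illustrative assumption and not the method used in the paper.
    """
    numerator = gaussian_pdf(score, same_mean, same_std)
    denominator = gaussian_pdf(score, diff_mean, diff_std)
    return numerator / denominator


# Hypothetical (mean, std) of scores for each hypothesis.
SAME = (2.0, 1.0)    # same-speaker comparisons
DIFF = (-2.0, 1.0)   # different-speaker comparisons

lr = likelihood_ratio(1.0, *SAME, *DIFF)
print(f"LR = {lr:.2f}, log10(LR) = {math.log10(lr):.2f}")
```

An LR above 1 supports the same-speaker hypothesis and an LR below 1 supports the different-speakers hypothesis. Note that this sketch returns a value for any input score, whatever the recording conditions or phonemic content, which is the kind of reliability gap the abstract points to.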