Phonetic content impact on Forensic Voice Comparison
Authors: Moez Ajili, Jean-François Bonastre, Waad Ben Kheder, Solange Rossato, Juliette Kahn
Venue: 2016 IEEE Spoken Language Technology Workshop (SLT)
Published: 2016-12-01
DOI: 10.1109/SLT.2016.7846267 (https://doi.org/10.1109/SLT.2016.7846267)
Citation count: 25
Abstract
Forensic Voice Comparison (FVC) increasingly uses the likelihood ratio (LR) to indicate whether the evidence supports the prosecution (same-speaker) or the defense (different-speaker) hypothesis. Beyond supporting one hypothesis, the LR provides a theoretically founded estimate of the relative strength of that support. Despite this theoretical appeal, the LR has practical limitations, due both to its estimation process and to a lack of knowledge about the reliability of that process. In many situations, a lack of reliability at the estimation level can destroy the reliability of the resulting LR. This is particularly true for automatic FVC, since Automatic Speaker Recognition (ASpR) systems output a score in all situations, regardless of case-specific conditions. Furthermore, ASpR systems use various normalization steps to present their scores as LRs, and these steps are potential sources of bias. LR estimation by ASpR systems leaves several factors out of account, such as the amount of information involved in the comparison, the phonemic content, and the speaker's intrinsic characteristics, denoted here the "speaker factor". Consequently, a more complete view of reliability appears necessary for FVC, even when an LR-like approach is used. This article focuses on the impact of phonemic content on FVC performance and variability. The experiments use the FABIOLE database, which is dedicated to this kind of study and makes it possible to examine both inter-speaker and intra-speaker variability. The results demonstrate the importance of phonemic content and highlight interesting differences between inter-speaker and intra-speaker effects.
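To make the LR mentioned in the abstract concrete, the following is a minimal sketch of how a comparison score might be turned into a likelihood ratio. It is not the authors' method or any particular ASpR system: the Gaussian score model, the function names, and the distribution parameters are all assumptions chosen for illustration only.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a normal distribution N(mean, std^2) evaluated at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def likelihood_ratio(score, same_mean, same_std, diff_mean, diff_std):
    """LR = p(score | same speaker) / p(score | different speakers).

    Both score distributions are assumed Gaussian (an illustrative choice).
    """
    numerator = gaussian_pdf(score, same_mean, same_std)
    denominator = gaussian_pdf(score, diff_mean, diff_std)
    return numerator / denominator


# Hypothetical (mean, std) of scores under each hypothesis; not real data.
SAME_SPEAKER = (2.0, 1.0)
DIFF_SPEAKER = (-2.0, 1.0)

if __name__ == "__main__":
    for score in (-1.0, 0.0, 1.0):
        lr = likelihood_ratio(score, *SAME_SPEAKER, *DIFF_SPEAKER)
        print(f"score={score:+.1f}  LR={lr:.3g}  log10(LR)={math.log10(lr):+.2f}")
```

An LR above 1 supports the same-speaker hypothesis and an LR below 1 supports the different-speaker hypothesis, with the magnitude indicating the strength of that support. Note that the sketch returns a value for any score, which illustrates the abstract's concern: nothing in this computation reflects how much speech was compared or what its phonemic content was.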