Massa Baali, Abdulhamid Aldoobi, Hira Dhamyal, Rita Singh, Bhiksha Raj
arXiv - CS - Sound, published 2024-09-09. https://doi.org/arxiv-2409.05799
PDAF: A Phonetic Debiasing Attention Framework For Speaker Verification
Speaker verification systems are crucial for authenticating identity through
voice. Traditionally, these systems focus on comparing feature vectors,
overlooking the speech's content. This paper challenges that practice by
highlighting the importance of phonetic dominance, a measure of the frequency
or duration of phonemes, as a crucial cue in speaker verification. A novel
Phoneme Debiasing Attention Framework (PDAF) is introduced, integrating with
existing attention frameworks to mitigate biases caused by phonetic dominance.
PDAF adjusts the weighting for each phoneme and influences feature extraction,
allowing for a more nuanced analysis of speech. This approach paves the way for
more accurate and reliable identity authentication through voice. Furthermore,
by employing various weighting strategies, we evaluate the influence of
phonetic features on the efficacy of the speaker verification system.
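
To make the idea of phonetic dominance concrete, the following is a minimal sketch, not the authors' implementation. It assumes frame-level phoneme labels are already available (for example from a forced aligner), measures dominance as the fraction of frames each phoneme occupies, and uses a hypothetical inverse-dominance weighting so that each phoneme contributes equally to the pooled embedding. The function names, the toy data, and the specific weighting rule are all assumptions made for illustration; the abstract does not specify the weighting strategies PDAF actually uses.

```python
from collections import Counter


def phonetic_dominance(frame_phonemes):
    """Fraction of frames occupied by each phoneme (a duration-based measure)."""
    counts = Counter(frame_phonemes)
    total = len(frame_phonemes)
    return {p: c / total for p, c in counts.items()}


def debiased_frame_weights(frame_phonemes):
    """Hypothetical inverse-dominance weights, normalized to sum to 1.

    Each frame is weighted by 1 / (number of frames of its phoneme), so every
    phoneme receives the same total weight regardless of how long it lasts.
    """
    counts = Counter(frame_phonemes)
    raw = [1.0 / counts[p] for p in frame_phonemes]
    norm = sum(raw)
    return [w / norm for w in raw]


def weighted_pool(frame_features, weights):
    """Weighted average of per-frame feature vectors."""
    dim = len(frame_features[0])
    return [
        sum(w * f[d] for w, f in zip(weights, frame_features))
        for d in range(dim)
    ]


# Toy utterance (assumed data): "AA" spans 3 frames, "S" spans 1 frame.
frames = ["AA", "AA", "AA", "S"]
features = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]

dominance = phonetic_dominance(frames)    # "AA" dominates with 3 of 4 frames
weights = debiased_frame_weights(frames)  # the single "S" frame gets the largest weight
embedding = weighted_pool(features, weights)
```

With uniform mean pooling, the dominant phoneme "AA" would account for three quarters of the pooled vector. Under the inverse-dominance weights in this sketch, both phonemes contribute equally, which is one simple way to picture what mitigating a bias caused by phonetic dominance could mean.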