SpMis: An Investigation of Synthetic Spoken Misinformation Detection
Peizhuo Liu, Li Wang, Renqiang He, Haorui He, Lei Wang, Huadi Zheng, Jie Shi, Tong Xiao, Zhizheng Wu
arXiv - CS - Computation and Language, 2024-09-17, arxiv-2409.11308
Abstract
In recent years, speech generation technology has advanced rapidly, fueled by
generative models and large-scale training techniques. While these developments
have enabled the production of high-quality synthetic speech, they have also
raised concerns about the misuse of this technology, particularly for
generating synthetic misinformation. Current research primarily focuses on
distinguishing machine-generated speech from human-produced speech, but the
more urgent challenge is detecting misinformation within spoken content. This
task requires a thorough analysis of factors such as speaker identity, topic,
and synthesis. To address this need, we conduct an initial investigation into
synthetic spoken misinformation detection by introducing an open-source
dataset, SpMis. SpMis includes speech synthesized from over 1,000 speakers
across five common topics, utilizing state-of-the-art text-to-speech systems.
Although our results show promising detection capabilities, they also reveal
substantial challenges for practical implementation, underscoring the
importance of ongoing research in this critical area.