{"title":"基于稀疏源检测的联合短时说话人识别与跟踪","authors":"Yao Guo, Hongyan Zhu","doi":"10.1051/aacus/2023004","DOIUrl":null,"url":null,"abstract":"A random finite set-based sequential Monte–Carlo tracking method is proposed to track multiple acoustic sources in indoor scenarios. The proposed method can improve tracking performance by introducing recognized speaker identities from the received signals. At the front-end, the degenerate unmixing estimation technique (DUET) is employed to separate the mixed signals, and the time delay of arrival (TDOA) is measured. In addition, a criterion to select the reliable microphone pair is designed to quickly obtain accurate speaker identities from the mixed signals, and the Gaussian mixture model universal background model (GMM-UBM) is employed to train the speaker model. In the tracking step, the update of the weight for each particle is derived after introducing the recognized speaker identities, which results in better association between the measurements and sources. Simulation results demonstrate that the proposed method can improve the accuracy of the filter states and discriminate the sources close to each other.","PeriodicalId":48486,"journal":{"name":"Acta Acustica","volume":"11 1","pages":""},"PeriodicalIF":1.0000,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Joint short-time speaker recognition and tracking using sparsity-based source detection\",\"authors\":\"Yao Guo, Hongyan Zhu\",\"doi\":\"10.1051/aacus/2023004\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"A random finite set-based sequential Monte–Carlo tracking method is proposed to track multiple acoustic sources in indoor scenarios. The proposed method can improve tracking performance by introducing recognized speaker identities from the received signals. At the front-end, the degenerate unmixing estimation technique (DUET) is employed to separate the mixed signals, and the time delay of arrival (TDOA) is measured. In addition, a criterion to select the reliable microphone pair is designed to quickly obtain accurate speaker identities from the mixed signals, and the Gaussian mixture model universal background model (GMM-UBM) is employed to train the speaker model. In the tracking step, the update of the weight for each particle is derived after introducing the recognized speaker identities, which results in better association between the measurements and sources. Simulation results demonstrate that the proposed method can improve the accuracy of the filter states and discriminate the sources close to each other.\",\"PeriodicalId\":48486,\"journal\":{\"name\":\"Acta Acustica\",\"volume\":\"11 1\",\"pages\":\"\"},\"PeriodicalIF\":1.0000,\"publicationDate\":\"2023-01-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Acta Acustica\",\"FirstCategoryId\":\"101\",\"ListUrlMain\":\"https://doi.org/10.1051/aacus/2023004\",\"RegionNum\":3,\"RegionCategory\":\"物理与天体物理\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q4\",\"JCRName\":\"ACOUSTICS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Acta Acustica","FirstCategoryId":"101","ListUrlMain":"https://doi.org/10.1051/aacus/2023004","RegionNum":3,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q4","JCRName":"ACOUSTICS","Score":null,"Total":0}
Joint short-time speaker recognition and tracking using sparsity-based source detection
A random finite set-based sequential Monte–Carlo tracking method is proposed to track multiple acoustic sources in indoor scenarios. The proposed method can improve tracking performance by introducing recognized speaker identities from the received signals. At the front-end, the degenerate unmixing estimation technique (DUET) is employed to separate the mixed signals, and the time delay of arrival (TDOA) is measured. In addition, a criterion to select the reliable microphone pair is designed to quickly obtain accurate speaker identities from the mixed signals, and the Gaussian mixture model universal background model (GMM-UBM) is employed to train the speaker model. In the tracking step, the update of the weight for each particle is derived after introducing the recognized speaker identities, which results in better association between the measurements and sources. Simulation results demonstrate that the proposed method can improve the accuracy of the filter states and discriminate the sources close to each other.
期刊介绍:
Acta Acustica, the Journal of the European Acoustics Association (EAA).
After the publication of its Journal Acta Acustica from 1993 to 1995, the EAA published Acta Acustica united with Acustica from 1996 to 2019. From 2020, the EAA decided to publish a journal in full Open Access. See Article Processing charges.
Acta Acustica reports on original scientific research in acoustics and on engineering applications. The journal considers review papers, scientific papers, technical and applied papers, short communications, letters to the editor. From time to time, special issues and review articles are also published. For book reviews or doctoral thesis abstracts, please contact the Editor in Chief.