Collaborative real-time speaker identification for wearable systems
M. Rossi, O. Amft, Martin Kusserow, G. Tröster
2010 IEEE International Conference on Pervasive Computing and Communications (PerCom), 2010-05-20
DOI: 10.1109/PERCOM.2010.5466976 (https://doi.org/10.1109/PERCOM.2010.5466976)
Citations: 8
Abstract
We present an unsupervised speaker identification system for personal annotations of conversations and meetings. The system dynamically learns new speakers and recognizes already known speakers using one audio channel and speech-independent modeling. Multiple personal systems could collaborate in robust unsupervised speaker identification and online learning. The system was optimized for real-time operation on a DSP system that can be worn during daily activities. The system was evaluated on the freely available 24-speaker Augmented Multiparty Interaction dataset. For a recognition time of 5 s, the system achieves a recognition rate of 81%. Collaboration between four identification systems resulted in a performance increase of up to 17%; however, even two collaborating systems yield a performance improvement. A prototypical wearable DSP implementation could continuously operate for more than 8 hours from a 4.1 Ah battery.
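The battery figure in the abstract implies a rough ceiling on the prototype's average current draw. The sketch below works through that arithmetic. It assumes the full nominal 4.1 Ah capacity is usable and ignores converter losses and discharge-rate effects, none of which the abstract specifies; the function name `max_average_current_a` is hypothetical and not taken from the paper.

```python
def max_average_current_a(capacity_ah: float, runtime_h: float) -> float:
    """Upper bound on average current (in amperes) for a battery of
    capacity_ah ampere-hours that lasts at least runtime_h hours.

    Assumption (not from the paper): the full nominal capacity is usable.
    """
    if runtime_h <= 0:
        raise ValueError("runtime_h must be positive")
    return capacity_ah / runtime_h


# Figures from the abstract: 4.1 Ah battery, more than 8 hours of operation.
bound = max_average_current_a(4.1, 8.0)
print(f"average current below about {bound:.4f} A")
```

Because the reported runtime is "more than 8 hours", the actual average draw would sit below this bound of roughly 0.51 A under the stated assumptions.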