Real-time robust recognition of speakers' emotions and characteristics on mobile platforms

F. Eyben, Bernd Huber, E. Marchi, Dagmar M. Schuller, Björn Schuller
2015 International Conference on Affective Computing and Intelligent Interaction (ACII), pp. 778-780
Published 2015-09-21 · DOI: 10.1109/ACII.2015.7344658
We demonstrate audEERING's sensAI technology running natively on low-resource mobile devices, applied to emotion analytics and speaker characterisation tasks. A showcase application for the Android platform is provided, in which audEERING's highly noise-robust voice activity detection, based on Long Short-Term Memory Recurrent Neural Networks (LSTM-RNN), is combined with our core emotion recognition and speaker characterisation engine natively on the mobile device. This eliminates the need for network connectivity and allows robust speaker state and trait recognition to be performed efficiently in real time, without network transmission lags. Real-time factors are benchmarked on a popular mobile device to demonstrate the efficiency, and average response times are compared to those of a server-based approach. The output of the emotion analysis is visualized graphically in the arousal-valence space, alongside the emotion category and further speaker characteristics.
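Two concepts from the abstract can be made concrete with a small sketch: the real-time factor used in the benchmark (processing time divided by audio duration; a value below 1 means the device keeps up with the incoming audio), and a coarse mapping from the arousal-valence space to an emotion quadrant. Both functions below are hypothetical illustrations, not part of audEERING's sensAI API, and the quadrant labels are a common textbook simplification rather than the engine's actual emotion categories.

```python
def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """Real-time factor (RTF): processing time divided by audio duration.

    RTF < 1 means analysis runs faster than real time on the device.
    """
    return processing_seconds / audio_seconds


def quadrant(arousal: float, valence: float) -> str:
    """Map a point in the arousal-valence plane to a coarse emotion quadrant.

    Illustrative labels only; the paper's engine outputs its own categories.
    Inputs are assumed centred at 0 (negative = low arousal / negative valence).
    """
    if arousal >= 0:
        return "excited/happy" if valence >= 0 else "angry/stressed"
    return "calm/content" if valence >= 0 else "sad/depressed"


# Example: 1.2 s of computation for a 4 s utterance gives RTF = 0.30,
# i.e. the analysis runs about 3.3x faster than real time.
rtf = real_time_factor(1.2, 4.0)
print(f"RTF = {rtf:.2f}")
print(quadrant(0.6, -0.4))  # high arousal, negative valence
```

Benchmarking on-device RTF against the end-to-end latency of a server round-trip (as the paper does with average response times) is what motivates running the engine natively rather than streaming audio to a backend.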