Detecting Psychological Stress from Speech using Deep Neural Networks and Ensemble Classifiers
Pub Date: 2021-10-13 | DOI: 10.1109/sped53181.2021.9587430
Serban Mihalache, D. Burileanu, C. Burileanu
Speech stress detection remains an important research area, with applicability to fields and tasks such as remote monitoring, virtual assistance software, forensic operations, and even health and safety. This paper proposes a deep learning system based on multiple Deep Neural Networks (DNNs) joined within an ensemble one-vs-one (OvO) classification strategy, using an extensive set of algorithmically extracted acoustic, prosodic, spectral, and cepstral features. The system was tested on the Speech Under Simulated and Actual Stress (SUSAS) database for five class subsets and groupings. Improvements were obtained over previously reported results, with an unweighted accuracy (UA) between 62.4% and 76.1%, depending on the number of classes and their grouping.
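The OvO ensemble strategy itself is straightforward to illustrate. Below is a minimal sketch, assuming scikit-learn and a small MLP as a stand-in for each of the paper's DNNs; the feature matrix and labels are random placeholders, not the SUSAS data or the paper's actual feature set.

    # One-vs-one ensemble: one binary classifier per class pair, majority vote.
    import numpy as np
    from sklearn.multiclass import OneVsOneClassifier
    from sklearn.neural_network import MLPClassifier
    from sklearn.metrics import balanced_accuracy_score  # i.e., unweighted accuracy (UA)

    rng = np.random.default_rng(0)
    X = rng.normal(size=(500, 40))    # placeholder feature vectors
    y = rng.integers(0, 5, size=500)  # placeholder 5-class stress labels

    # 5 classes -> 10 pairwise binary MLPs, combined by voting.
    ovo = OneVsOneClassifier(MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=500))
    ovo.fit(X[:400], y[:400])
    print("UA:", balanced_accuracy_score(y[400:], ovo.predict(X[400:])))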
{"title":"Detecting Psychological Stress from Speech using Deep Neural Networks and Ensemble Classifiers","authors":"Serban Mihalache, D. Burileanu, C. Burileanu","doi":"10.1109/sped53181.2021.9587430","DOIUrl":"https://doi.org/10.1109/sped53181.2021.9587430","url":null,"abstract":"Speech stress detection remains an important research area, with applicability to fields and tasks such as remote monitoring, virtual assistance software, forensics operations, and even health and safety. This paper proposes a deep learning system, based on multiple Deep Neural Networks (DNNs) joined within an ensemble one-vs-one (OvO) classification strategy, using an extensive set of algorithmically extracted acoustic, prosodic, spectral, and cepstral features. The system was tested on the Speech Under Simulated and Actual Stress (SUSAS) database, for 5 class subsets and groups. Improvements have been obtained over previously reported results, with an unweighted accuracy (UA) between 62.4% and 76.1%, depending on the number of classes and their grouping.","PeriodicalId":193702,"journal":{"name":"2021 International Conference on Speech Technology and Human-Computer Dialogue (SpeD)","volume":"39 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-10-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124824149","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A Fully Autonomous Person Re-Identification System
Pub Date: 2021-10-13 | DOI: 10.1109/sped53181.2021.9587446
Roxana Mihaescu, Mihai Chindea, S. Carata, M. Ghenescu, C. Paleologu
The problem of re-identification involves associating the appearances of a person captured by one or more surveillance cameras. This task is especially challenging in very crowded areas, where occlusions of people can drastically reduce visibility. In this paper, we aim to obtain a fully automatic re-identification system that places a person-detection stage before the re-identification stage. Both stages are based on a general-purpose Deep Neural Network (DNN) object detector, the YOLO (You Only Look Once) model. The primary purpose and novelty of the proposed method is to obtain an autonomous re-identification system starting from a simple detection model. Thus, with minimal computational and hardware resources, the proposed method achieves results comparable to those of other existing methods, even when running in real time on multiple security cameras.
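As a rough illustration of the two-stage pipeline described above, the sketch below wires a detection stage into a re-identification stage. Here detect_persons and embed_crop are hypothetical stand-ins for the paper's YOLO-based models, gallery is assumed to map enrolled identity labels to previously computed embeddings, and the matching rule (cosine distance against the gallery) is a common choice rather than necessarily the paper's.

    import numpy as np

    def reidentify(frame, gallery, detect_persons, embed_crop, threshold=0.35):
        """Return (person_id or None, box) pairs for one camera frame."""
        matches = []
        for box in detect_persons(frame):          # stage 1: person detection
            x1, y1, x2, y2 = box
            emb = embed_crop(frame[y1:y2, x1:x2])  # stage 2: appearance embedding
            emb = emb / np.linalg.norm(emb)
            ids = list(gallery)                    # enrolled identity labels
            dists = [1.0 - float(emb @ (g / np.linalg.norm(g)))
                     for g in gallery.values()]    # cosine distances to gallery
            best = int(np.argmin(dists))
            matches.append((ids[best] if dists[best] < threshold else None, box))
        return matches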
{"title":"A Fully Autonomous Person Re-Identification System","authors":"Roxana Mihaescu, Mihai Chindea, S. Carata, M. Ghenescu, C. Paleologu","doi":"10.1109/sped53181.2021.9587446","DOIUrl":"https://doi.org/10.1109/sped53181.2021.9587446","url":null,"abstract":"The problem of re-identification involves the association of the appearances of a person caught with one or more surveillance cameras. This task is especially challenging in very crowded areas, where possible occlusions of people can drastically reduce visibility. In this paper, we aim to obtain a fully automatic re-identification system containing a stage of detection of persons before the stage of re-identification. Both stages are based on a general-purpose DNN (Deep Neural Network) object detector - the YOLO (You Only Look Once) model. The primary purpose and novelty of the proposed method are to obtain an autonomous re-identification system, starting from a simple detection model. Thus, with minimal computational and hardware resources, the proposed method leads to comparable results with other existing methods, even when running in real-time on multiple security cameras.","PeriodicalId":193702,"journal":{"name":"2021 International Conference on Speech Technology and Human-Computer Dialogue (SpeD)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-10-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129447461","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Speech under combined Schizophrenia and Salivary Flow Alterations – Preliminary Data and Results
Pub Date: 2021-10-13 | DOI: 10.1109/sped53181.2021.9587347
H. Teodorescu, S. Andrian, C. Ghiorghe, Ș. Gheltu, Ionuţ Tărăboanţă
We describe a method and the initial phase of building a database for investigating the combined effects of schizophrenia and drug-induced salivary flow alterations on speech and, based on the preliminary results, we propose a quantitative method for assessing these effects. We believe this is the first attempt to conduct a systematic study combining these two causes, with a narrow focus on speech changes related to fricatives.
{"title":"Speech under combined Schizophrenia and Salivary Flow Alterations – Preliminary Data and Results","authors":"H. Teodorescu, S. Andrian, C. Ghiorghe, Ș. Gheltu, Ionuţ Tărăboanţă","doi":"10.1109/sped53181.2021.9587347","DOIUrl":"https://doi.org/10.1109/sped53181.2021.9587347","url":null,"abstract":"We describe a method and the initial phase of building a database for investigating the combined effects of schizophrenia and drug-induced salivary flow alterations on speech and, based on the preliminary results, we propose a quantitative method of assessing these effects. We believe that this is the first attempt to conduct a systematic study with these two causes combined and with narrow focusing on speech changes related to fricatives.","PeriodicalId":193702,"journal":{"name":"2021 International Conference on Speech Technology and Human-Computer Dialogue (SpeD)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-10-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130894458","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A Data-Reuse Approach for an Optimized LMS Algorithm
Pub Date: 2021-10-13 | DOI: 10.1109/sped53181.2021.9587371
Alexandru-George Rusu, Laura-Maria Dogariu, S. Ciochină, C. Paleologu
Least-mean-square (LMS) algorithms are widely used in signal processing, especially in the system identification context. The classic LMS algorithm has a major drawback: its fixed step-size limits the overall performance. The optimized LMS (LMSO) algorithm introduces a variable step-size derived from an optimization criterion, thereby overcoming this drawback of the LMS algorithm. Scenarios in which the unknown system changes have highlighted the need for the LMSO algorithm to model the new system faster. In this paper, we apply the data-reuse approach to the LMSO algorithm, aiming to increase the convergence rate. The simulations outline the performance improvement obtained by combining the data-reuse method with the LMSO algorithm.
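The data-reuse idea is simply to apply the adaptive-filter update several times on the same input/output sample before moving on. A minimal sketch in a system-identification setup follows; the paper's optimized (LMSO) variable step-size is not reproduced here, so a normalized step stands in for it.

    import numpy as np

    def dr_lms(x, d, L=32, mu=0.5, reuse=3, eps=1e-8):
        """Identify an unknown L-tap FIR system from input x and desired output d."""
        w = np.zeros(L)
        for n in range(L - 1, len(x)):
            xn = x[n - L + 1:n + 1][::-1]  # most recent L input samples
            for _ in range(reuse):         # data-reuse: repeat the update
                e = d[n] - w @ xn          # a priori error
                w += (mu / (eps + xn @ xn)) * e * xn
        return w

    # Toy usage: recover a random 32-tap system from white-noise input.
    rng = np.random.default_rng(1)
    h = rng.normal(size=32)
    x = rng.normal(size=5000)
    d = np.convolve(x, h)[:len(x)]         # unknown system's output
    print("max coefficient error:", np.abs(dr_lms(x, d) - h).max())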
{"title":"A Data-Reuse Approach for an Optimized LMS Algorithm","authors":"Alexandru-George Rusu, Laura-Maria Dogariu, S. Ciochină, C. Paleologu","doi":"10.1109/sped53181.2021.9587371","DOIUrl":"https://doi.org/10.1109/sped53181.2021.9587371","url":null,"abstract":"The least-mean-square (LMS) type algorithms are widely spread in signal processing, especially in the system identification context. The classic LMS algorithm has major drawbacks due to the fixed step-size that limits the overall performance. The optimized LMS (LMSO) algorithm followed an optimization criterion and introduced a variable step-size so that it overcomes the drawbacks of the LMS algorithm. Some scenarios where the unknown system changes have highlighted the need for the LMSO algorithm to improve how fast it models the new system. In this paper, we apply the data-reuse approach for the LMSO algorithm aiming to increase the convergence rate. The simulations outline the performance improvement for the data-reuse method in combination with the LMSO algorithm.","PeriodicalId":193702,"journal":{"name":"2021 International Conference on Speech Technology and Human-Computer Dialogue (SpeD)","volume":"64 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-10-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117335968","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Project Vāc: Can a Text-to-Speech Engine Generate Human Sentiments?
Pub Date: 2021-10-13 | DOI: 10.1109/sped53181.2021.9587366
S. Kulkarni, Luis Barbado, Jordan Hosier, Yu Zhou, Siddharth Rajagopalan, V. Gurbani
Sentiment analysis is an important area of natural language processing (NLP) research and is increasingly performed by machine learning models. Much of the work in this area concentrates on extracting sentiment from textual data sources. A textual source, however, does not convey the pitch, prosody, or power of the spoken sentiment, making it attractive to extract sentiment directly from an audio stream. A fundamental prerequisite for sentiment analysis on audio streams is the availability of reliable, appropriately labeled acoustic representations of sentiment. The lack of an existing large-scale dataset in this form forces researchers to curate audio datasets from a variety of sources, often by manually labeling the audio corpus. However, this approach is inherently subjective: what appears “positive” to one human listener may appear “neutral” to another. Such challenges yield suboptimal datasets that are often class-imbalanced, and the inevitable biases present in the labeling process can permeate these models in problematic ways. To mitigate these disadvantages, we propose the use of a text-to-speech (TTS) engine to generate labeled synthetic voice samples rendered in one of three sentiments: positive, negative, or neutral. The advantage of using a TTS engine is that it can be abstracted as a function that generates an infinite set of labeled samples on which a sentiment detection model can be trained. We investigate, in particular, the extent to which such training achieves acceptable accuracy when the induced model is tested on a separate, independent speech source whose data are not drawn from the same distribution as the training dataset. Our results indicate that this approach shows promise and that the induced model does not suffer from underspecification.
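The core abstraction, a TTS engine treated as a function producing an endless stream of labeled samples, can be sketched as follows. Here tts_render(text, sentiment) is a hypothetical placeholder for whatever engine is used, since the abstract does not name one; it is assumed to return a waveform.

    import itertools

    SENTIMENTS = ("positive", "negative", "neutral")

    def labeled_samples(tts_render, prompts):
        """Yield an effectively infinite stream of (waveform, sentiment) pairs."""
        pairs = [(t, s) for t in prompts for s in SENTIMENTS]
        for text, sentiment in itertools.cycle(pairs):
            yield tts_render(text, sentiment), sentiment

    # Usage sketch: draw as many training examples as needed, e.g.
    # for wav, label in itertools.islice(labeled_samples(engine, prompts), 10_000):
    #     ...train the sentiment detection model on (wav, label)...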
{"title":"Project Vāc: Can a Text-to-Speech Engine Generate Human Sentiments?","authors":"S. Kulkarni, Luis Barbado, Jordan Hosier, Yu Zhou, Siddharth Rajagopalan, V. Gurbani","doi":"10.1109/sped53181.2021.9587366","DOIUrl":"https://doi.org/10.1109/sped53181.2021.9587366","url":null,"abstract":"Sentiment analysis is an important area of natural language processing (NLP) research, and is increasingly being performed by machine learning models. Much of the work in this area is concentrated on extracting sentiment from textual data sources. Clearly however, a textual source does not convey the pitch, prosody, or power of the spoken sentiment, making it attractive to extract sentiments from an audio stream. A fundamental prerequisite for sentiment analysis on audio streams is the availability of reliable acoustic representation of sentiment, appropriately labeled. The lack of an existing, large-scale dataset in this form forces researchers to curate audio datasets from a variety of sources, often by manually labeling the audio corpus. However, this approach is inherently subjective. What appears “positive” to one human listener may appear “neutral” to another. Such challenges yield sub-optimal datasets that are often class imbalanced, and the inevitable biases present in the labeling process can permeate these models in problematic ways. To mitigate these disadvantages, we propose the use of a text-to-speech (TTS) engine to generate labeled synthetic voice samples rendered in one of three sentiments: positive, negative, or neutral. The advantage of using a TTS engine is that it can be abstracted as a function that generates an infinite set of labeled samples, on which a sentiment detection model can be trained. We investigate, in particular, the extent to which such training exhibits acceptable accuracy when the induced model is tested on a separate, independent and identically distributed speech source (i.e., the test dataset is not drawn from the same distribution as the training dataset). Our results indicate that this approach shows promise and the induced model does not suffer from underspecification.","PeriodicalId":193702,"journal":{"name":"2021 International Conference on Speech Technology and Human-Computer Dialogue (SpeD)","volume":"83 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-10-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123177385","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A Review of Automated Intelligibility Assessment for Dysarthric Speakers
Pub Date: 2021-10-13 | DOI: 10.1109/sped53181.2021.9587400
Andy Huang, Kyle Hall, C. Watson, Seyed Reza Shahamiri
Automated dysarthria intelligibility assessment offers the opportunity to develop reliable, low-cost, and scalable tools that help address the shortcomings of manual, subjective intelligibility assessments. This paper reviews the literature on automated intelligibility assessment, identifying the highest-performing published models and concluding with promising avenues for further research. Our review shows that most of the existing work achieved very high accuracies. However, we found that most of these studies validated their models using speech samples from the same speakers used in training, making their results less generalizable. Furthermore, there is a lack of study of how well these models perform on speakers from different datasets or with different microphone setups. This lack of generalizability has implications for the real-life application of these models. Future research directions could include more robust validation methods, such as using unseen speakers and incorporating speakers from different datasets. This would provide confidence that the models generalize and can therefore be used in real-world clinical practice.
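The validation flaw the review identifies, testing on speakers seen in training, is avoided by speaker-independent splits. A minimal sketch using scikit-learn's GroupKFold follows, where groups holds a speaker ID per utterance so that no speaker appears in both the training and test folds; the features, labels, and classifier are toy placeholders, not any of the reviewed models.

    import numpy as np
    from sklearn.model_selection import GroupKFold
    from sklearn.svm import SVC

    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 20))          # toy feature vectors
    y = rng.integers(0, 2, size=200)        # toy intelligibility labels
    groups = rng.integers(0, 10, size=200)  # speaker ID for each utterance

    for train_idx, test_idx in GroupKFold(n_splits=5).split(X, y, groups):
        model = SVC().fit(X[train_idx], y[train_idx])
        # No speaker overlap between folds: evaluation is on unseen speakers.
        assert set(groups[train_idx]).isdisjoint(groups[test_idx])
        print("fold accuracy:", model.score(X[test_idx], y[test_idx]))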
{"title":"A Review of Automated Intelligibility Assessment for Dysarthric Speakers","authors":"Andy Huang, Kyle Hall, C. Watson, Seyed Reza Shahamiri","doi":"10.1109/sped53181.2021.9587400","DOIUrl":"https://doi.org/10.1109/sped53181.2021.9587400","url":null,"abstract":"Automated dysarthria intelligibility assessment offers the opportunity to develop reliable, low-cost, and scalable tools, which help to solve current shortcomings of manual and subjective intelligibility assessments. This paper reviews the literature regarding automated intelligibility assessment, identifying the highest performing published models and concluding on promising avenues for further research. Our review shows that most of the existing work were able to achieve very high accuracies. However, we have found that most of these studies validated their models using speech samples of the same speakers used in training, making their results less generalizable. Furthermore, there is a lack of study on how well these models perform on speakers from different datasets or different microphone setups. This lack of generalizability has implications to the real-life application of these models. Future research directions could include the use of more robust methods of validation such as using unseen speakers, as well as incorporating speakers from different datasets. This would provide confidence that the models are generalized and therefore allow them to be used in real-world clinical practice.","PeriodicalId":193702,"journal":{"name":"2021 International Conference on Speech Technology and Human-Computer Dialogue (SpeD)","volume":"85 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-10-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132721819","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Assessment of Pronunciation in Language Learning Applications
Pub Date: 2021-10-13 | DOI: 10.1109/sped53181.2021.9587353
Camelia-Georgiana Stativă, Adrian Iftene, Camelia-Maria Milut
This paper proposes a smartphone application meant to be used when learning a new language. The application offers its users a series of exercises oriented towards word reproduction, aiming to enhance one’s vocabulary and improve pronunciation, and it is capable of indicating the flaws in the user’s utterance. The targeted users are Romanian speakers willing to learn and practice English, with profiles for both children (beginners) and adults (advanced). The core of the application is the pronunciation module; we present two methods of analysing pronunciation accuracy, along with the benefits and disadvantages of each. Users of this application take advantage of two important factors in studying a foreign language: practising it and receiving continuous feedback.
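The abstract does not detail the two scoring methods, so purely as an illustrative assumption, a common baseline for this kind of feedback compares MFCCs of the learner's utterance with a reference recording via dynamic time warping, turning alignment cost into a rough pronunciation score.

    import numpy as np
    import librosa

    def pronunciation_score(ref_wav, user_wav, sr=16000):
        """Score user_wav against ref_wav; higher means closer pronunciation."""
        ref, _ = librosa.load(ref_wav, sr=sr)
        usr, _ = librosa.load(user_wav, sr=sr)
        m_ref = librosa.feature.mfcc(y=ref, sr=sr, n_mfcc=13)
        m_usr = librosa.feature.mfcc(y=usr, sr=sr, n_mfcc=13)
        D, wp = librosa.sequence.dtw(X=m_ref, Y=m_usr, metric="euclidean")
        cost = D[-1, -1] / len(wp)           # alignment cost per aligned frame
        return float(np.exp(-cost / 100.0))  # squash to (0, 1]; scale is ad hoc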
{"title":"Assessment of Pronunciation in Language Learning Applications","authors":"Camelia-Georgiana Stativă, Adrian Iftene, Camelia-Maria Milut","doi":"10.1109/sped53181.2021.9587353","DOIUrl":"https://doi.org/10.1109/sped53181.2021.9587353","url":null,"abstract":"This paper proposes a smartphone application meant to be used in the process of learning a new language. Our application introduces to its users a series of exercises oriented towards word reproduction, aiming to enhance one’s vocabulary alongside improving the pronunciation, being capable to indicate the flaws in the user’s utterance. The targeted users are Romanian language speakers willing to learn and practice English, with profiles for both children (beginners) and adults (advanced). The core of the application is the pronunciation module. It will be presented with two methods of analysing the accuracy of the pronunciation and the benefits and disadvantages brought by each of them. The users of this application will take advantage of two important factors involved in the process of studying a foreign language: applying it and receiving continuous feedback.","PeriodicalId":193702,"journal":{"name":"2021 International Conference on Speech Technology and Human-Computer Dialogue (SpeD)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-10-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131035139","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Speaker Verification Experiments using Identity Vectors, on a Romanian Speakers Corpus
Pub Date: 2021-10-13 | DOI: 10.1109/sped53181.2021.9587396
Oana-Mariana Novac, Stefan-Adrian Toma, Emil Bureaca
One of the applications of speaker recognition technologies is in the forensics field. It is reasonable to assume that target speakers are not always cooperative, i.e., there may be no available recordings, and even when there are, they are not always in the language for which the speaker was enrolled. In this study we present a set of experiments with an identity-vector speaker recognition system trained and tested on a Romanian language corpus (RoDigits), along with an assessment of its performance when there is a mismatch between the training and testing languages.
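The identity-vector front-end (UBM and total-variability matrix) is beyond a short sketch, but the verification back-end is compact: an enrollment i-vector is compared to a test i-vector by cosine scoring against a decision threshold. The vectors below are random stand-ins, and the threshold is arbitrary, not the paper's operating point.

    import numpy as np

    def verify(enroll_ivec, test_ivec, threshold=0.5):
        """Accept the identity claim if the cosine score exceeds the threshold."""
        score = (enroll_ivec @ test_ivec) / (
            np.linalg.norm(enroll_ivec) * np.linalg.norm(test_ivec))
        return score, score > threshold

    rng = np.random.default_rng(0)
    print(verify(rng.normal(size=400), rng.normal(size=400)))  # toy 400-dim i-vectors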
{"title":"Speaker Verification Experiments using Identity Vectors, on a Romanian Speakers Corpus","authors":"Oana-Mariana Novac, Stefan-Adrian Toma, Emil Bureaca","doi":"10.1109/sped53181.2021.9587396","DOIUrl":"https://doi.org/10.1109/sped53181.2021.9587396","url":null,"abstract":"One of the applications of speaker recognition technologies is in the forensics field. It is reasonable to assume that target speakers are not always cooperating, i.e., there are no available recordings, and even if they are, they are not always in the language for which the speaker was enrolled. In this study we present a set of experiments with an identity vector speaker recognition system, trained and tested with a Romanian language corpus (RoDigits), along with an assessment of its performance when there’s mismatch between training and testing language.","PeriodicalId":193702,"journal":{"name":"2021 International Conference on Speech Technology and Human-Computer Dialogue (SpeD)","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-10-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117299290","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Improvements of SpeeD’s Romanian ASR system during ReTeRom project
Pub Date: 2021-10-13 | DOI: 10.1109/sped53181.2021.9587383
Alexandru-Lucian Georgescu, H. Cucu, C. Burileanu
Automatic speech recognition (ASR) for the Romanian language is attracting increasing interest from the scientific community. In the last two years, several research groups have reported valuable results on speech recognition and dialogue tasks for Romanian. In this paper we present the improvements we recently obtained by collecting and using more text and audio data to train the language and acoustic models. We emphasize the automatic methodologies employed to facilitate data collection and annotation. In comparison to our previous work, we report state-of-the-art results for read speech (WER of 1.6%) and significantly better results on spontaneous speech (a relative improvement of around 40%). To facilitate direct comparison with other ASR systems, we release all evaluation datasets, totaling 10 hours of manually annotated speech.
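For reference, the metric behind the reported 1.6% figure, word error rate (WER), is the word-level Levenshtein distance between reference and hypothesis, normalized by the reference length. A minimal dynamic-programming sketch:

    def wer(reference: str, hypothesis: str) -> float:
        ref, hyp = reference.split(), hypothesis.split()
        # d[i][j] = edit distance between ref[:i] and hyp[:j]
        d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
        for i in range(len(ref) + 1):
            d[i][0] = i
        for j in range(len(hyp) + 1):
            d[0][j] = j
        for i in range(1, len(ref) + 1):
            for j in range(1, len(hyp) + 1):
                sub = d[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
                d[i][j] = min(sub, d[i - 1][j] + 1, d[i][j - 1] + 1)  # sub/del/ins
        return d[len(ref)][len(hyp)] / len(ref)

    print(wer("ana are mere", "ana am mere"))  # 1 substitution in 3 words -> 0.333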
{"title":"Improvements of SpeeD’s Romanian ASR system during ReTeRom project","authors":"Alexandru-Lucian Georgescu, H. Cucu, C. Burileanu","doi":"10.1109/sped53181.2021.9587383","DOIUrl":"https://doi.org/10.1109/sped53181.2021.9587383","url":null,"abstract":"Automatic speech recognition (ASR) for Romanian language is on an ascending trend of interest for the scientific community. In the last two years several research groups reported valuable results on speech recognition and dialogue tasks for Romanian. In our paper we present the improvements we recently obtained by collecting and using more text and audio data for training the language and acoustic models. We emphasize the automatic methodologies employed to facilitate data collection and annotation. In comparison to our previous work, we report state-of-the-art results for read speech (WER of 1.6%) and significantly better results on spontaneous speech: relative improvement around 40%). In order to facilitate direct comparison with other ASR systems, we release all evaluation datasets, totaling 10 hours of manually annotated speech.","PeriodicalId":193702,"journal":{"name":"2021 International Conference on Speech Technology and Human-Computer Dialogue (SpeD)","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-10-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124816764","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A Cognitive View on Intonational Meaning
Pub Date: 2021-10-13 | DOI: 10.1109/sped53181.2021.9587358
D. Jitca
The paper proposes a cognitive view of intonational contours, aiming to describe their partitions in terms of elementary cognitive categories related to a generic information packaging (IPk) mechanism. We formulate the hypothesis that IPk structures pack auditory information items at the cortical level into relations that are marked at the utterance level by prosodic phrases. An IPk model based on this hypothesis is used to describe two pairs of contours presented in [1] as problematic for a categorical phonological description. The paper proposes a cognitive description of these contours after partitioning them into hierarchies of nested IPk units. In this view, phonological events become marks of functional constituents at the cognitive level, and the semantic differences between contours are reflected in their structural cognitive differences.
{"title":"A Cognitive View on Intonational Meaning","authors":"D. Jitca","doi":"10.1109/sped53181.2021.9587358","DOIUrl":"https://doi.org/10.1109/sped53181.2021.9587358","url":null,"abstract":"The paper proposes a cognitive view on intonational contours aiming to describe their partitions in terms of elementary cognitive categories related to a generic information packaging (IPk) mechanism. We formulate the hypothesis that IPk structures pack auditory information items at the cortical level into relations that are marked at the utterance level by prosodic phrases. An IPk model based on this hypothesis is used in the paper for describing two pairs of contours that are presented in [1] as problematic for a categorical phonological description. The paper proposes a cognitive description of the respective contours after their partitioning into hierarchies of nested IPk units. In this view phonological events become marks of functional constituents at the cognitive level and the semantic differences between contours are reflected by their structural cognitive differences.","PeriodicalId":193702,"journal":{"name":"2021 International Conference on Speech Technology and Human-Computer Dialogue (SpeD)","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-10-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125083674","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}