
Latest publications from the 2020 23rd Conference of the Oriental COCOSDA International Committee for the Co-ordination and Standardisation of Speech Databases and Assessment Techniques (O-COCOSDA)

Intent Classification on Myanmar Social Media Data in Telecommunication Domain Using Convolutional Neural Network and Word2Vec
Thet Naing Tun, K. Soe
Nowadays, people widely use social media and spend more and more time on it. The intentions behind user-generated content range from social good to feedback about a company's service or product. With the help of deep learning models, users' intentions can be classified more accurately. This paper focuses on intent classification of user-generated comments posted in Myanmar text on social media. Word2Vec is used to convert words into vector representations, which are input to a Convolutional Neural Network (CNN) that classifies each comment into one of a set of pre-defined classes. The Continuous Bag of Words (CBOW) architecture is used to train the Word2Vec model. The proposed model was compared against a baseline Recurrent Neural Network (RNN) model with a single recurrent layer. Facebook is the target social media platform. Content from social media is domain-independent, which makes it difficult to classify; in the proposed model, telecommunication is therefore chosen as the target domain. Users' comments from that domain are regarded as feedback and collected as training and test data for the model. According to the experimental results, the proposed model outperforms the RNN, achieving an average F-score of 0.94.
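The first stage of the pipeline described above, CBOW-trained Word2Vec embeddings that are later fed to a CNN classifier, can be sketched in miniature as follows. The toy corpus, vocabulary, window (here, the whole sentence), embedding dimension, and learning rate are all illustrative assumptions, not the paper's actual settings.

```python
import numpy as np

# Toy corpus standing in for tokenized Myanmar telecom comments (illustrative only).
corpus = [["internet", "speed", "slow"],
          ["top", "up", "balance"],
          ["internet", "balance", "check"]]
vocab = sorted({w for sent in corpus for w in sent})
idx = {w: i for i, w in enumerate(vocab)}
V, D = len(vocab), 8  # vocabulary size, embedding dimension

rng = np.random.default_rng(0)
W_in = rng.normal(scale=0.1, size=(V, D))   # input (context) embeddings
W_out = rng.normal(scale=0.1, size=(D, V))  # output projection

def cbow_step(context, target, lr=0.1):
    """One CBOW update: average the context vectors, predict the centre word."""
    h = W_in[[idx[w] for w in context]].mean(axis=0)  # hidden layer
    scores = h @ W_out
    probs = np.exp(scores - scores.max())
    probs /= probs.sum()                              # softmax over the vocabulary
    err = probs.copy()
    err[idx[target]] -= 1.0                           # softmax cross-entropy gradient
    W_out_grad = np.outer(h, err)
    h_grad = W_out @ err
    for w in context:                                 # spread gradient over context words
        W_in[idx[w]] -= lr * h_grad / len(context)
    W_out[...] -= lr * W_out_grad
    return -np.log(probs[idx[target]])                # cross-entropy loss

losses = []
for _ in range(50):
    for sent in corpus:
        for i, target in enumerate(sent):
            losses.append(cbow_step(sent[:i] + sent[i + 1:], target))

embedding = W_in  # rows are the word vectors a downstream CNN would consume
```

In a real system the embedding matrix would be trained on a large comment corpus and then used to look up the input sequence for the CNN classifier.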
DOI: https://doi.org/10.1109/O-COCOSDA50338.2020.9295031 (published 2020-11-05)
Citations: 1
VOIS: The First Speech Therapy App Specifically Designed for Myanmar Hearing-Impaired Children
A. Thida, Nway Nway Han, Sheinn Thawtar Oo, Sheng Li, Chenchen Ding
Educating hearing-impaired children is challenging because they are unlikely to develop normal speech and language ability. We propose VOIS, a mobile application that is the first speech therapy application for hearing-impaired children in Myanmar. The application uses an offline Burmese speech recognition system based on a Convolutional Neural Network (CNN). It helps hearing-impaired children train the prerequisites of the language at their own pace. To effectively help them understand the basics of the language, the system provides one-syllable and two-syllable Myanmar words collected from real-life educational and communication materials. Experimental results show that the system's prediction rate is nearly 60%. Experiments also show that hearing-impaired children can learn and use the language freely through simple practice with this application. The expectation is that the application can bring both opportunities and quality-of-life improvements to children with hearing loss in Myanmar.
DOI: https://doi.org/10.1109/O-COCOSDA50338.2020.9295024 (published 2020-11-05)
Citations: 2
Creation and Analysis of Emotional Speech Database for Multiple Emotions Recognition
Ryota Sato, Ryohei Sasaki, Norisato Suga, T. Furukawa
Speech emotion recognition (SER) is one of the latest challenges in human-computer interaction. Conventional SER classification methods output a single emotion label per utterance as the estimation result, because the speech emotion databases used to train SER models assign a single emotion label to each utterance. However, in human speech multiple emotions are often expressed simultaneously with different intensities. To realize more natural SER, the presence of multiple emotions in one utterance should be taken into account. We therefore created an emotional speech database that contains multiple emotion labels and their intensities. The database was built by extracting utterances in which emotions appear from existing video works. In addition, we evaluated the created database through statistical analysis. As a result, 2,025 samples were obtained, of which 1,525 contained multiple emotions.
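A database entry carrying several emotion labels with intensities, and the kind of statistic reported above (the fraction of samples with more than one emotion), might be represented as below. The utterance ids, emotion names, and intensity scale are hypothetical placeholders, not the database's actual schema.

```python
# Each sample: utterance id mapped to {emotion: intensity} (illustrative labels).
samples = {
    "utt001": {"joy": 0.8, "surprise": 0.4},
    "utt002": {"anger": 0.9},
    "utt003": {"sadness": 0.6, "anger": 0.3, "fear": 0.2},
    "utt004": {"joy": 1.0},
}

# Count how many utterances carry more than one emotion label.
multi = [u for u, labels in samples.items() if len(labels) > 1]
ratio = len(multi) / len(samples)
print(f"{len(multi)}/{len(samples)} samples carry multiple emotions ({ratio:.0%})")
```

With the paper's figures this ratio would be 1,525 / 2,025, i.e. roughly 75% of samples are multi-emotion.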
DOI: https://doi.org/10.1109/O-COCOSDA50338.2020.9295041 (published 2020-11-05)
Citations: 4
Region Report 2020 Hong Kong
Tan Lee
This article consists only of a collection of slides from the author's conference presentation.
DOI: https://doi.org/10.1109/o-cocosda50338.2020.9295034 (published 2020-11-05)
Citations: 0
Myanmar News Headline Generation with Sequence-to-Sequence model
Yamin Thu, Win Pa Pa
News headline generation has recently become one of the most valuable research topics in NLP. It consists of learning to map articles to headlines, here with a sequence-to-sequence (Seq2Seq) model. This work applies a headline generator whose encoder and decoder are built with Long Short-Term Memory (LSTM) units. In this paper, automatic headline generation for Myanmar news articles using a Seq2Seq model is implemented. There are various ways to generate a news headline; here, headlines are generated using Seq2Seq with one-hot encoding, and the comparative analysis results are described. Constructing the model poses challenges such as counting the vocabulary and identifying unknown terms in the word embeddings. To obtain more meaningful results, error analysis was applied to a typical neural headline generation system, and machine-generated headlines were evaluated against the actual headlines using the ROUGE metric. Experiments were conducted on a Myanmar news dataset of 7,000 pairs of news articles and their corresponding headlines. According to the evaluation, Seq2Seq with one-hot encoding outperforms Seq2Seq with word embeddings (GloVe) and a Recursive Recurrent Neural Network (Recursive RNN).
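The one-hot input encoding and the unknown-word handling mentioned above can be sketched as follows. The example sentences, the `<unk>` token name, and the `min_count` cutoff are illustrative assumptions, not the paper's configuration.

```python
import numpy as np

def build_vocab(sentences, min_count=1):
    """Count words and assign indices; rare or unseen words map to <unk>."""
    counts = {}
    for sent in sentences:
        for w in sent:
            counts[w] = counts.get(w, 0) + 1
    vocab = ["<unk>"] + sorted(w for w, c in counts.items() if c >= min_count)
    return {w: i for i, w in enumerate(vocab)}

def one_hot(sent, word2idx):
    """Encode a token sequence as a (sequence length, |V|) one-hot matrix."""
    mat = np.zeros((len(sent), len(word2idx)))
    for t, w in enumerate(sent):
        mat[t, word2idx.get(w, word2idx["<unk>"])] = 1.0
    return mat

articles = [["market", "prices", "rise"], ["new", "road", "opens"]]
w2i = build_vocab(articles)
enc = one_hot(["market", "crashes"], w2i)  # "crashes" is out-of-vocabulary → <unk> slot
```

Each row of `enc` would be fed to the LSTM encoder one timestep at a time; the decoder produces headline tokens the same way in reverse.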
DOI: https://doi.org/10.1109/O-COCOSDA50338.2020.9295017 (published 2020-11-05)
Citations: 3
Prosodic Information-Assisted DNN-based Mandarin Spontaneous-Speech Recognition
Yu-Chih Deng, Cheng-Hsin Lin, Y. Liao, Yih-Ru Wang, Sin-Horng Chen
This paper continues the method proposed in [1] and updates its traditional HMM-based ASR to a state-of-the-art DNN-based ASR. Prosodic information is used to assist DNN-based Mandarin spontaneous-speech recognition, especially to alleviate the serious interference of disfluencies and paralinguistic phenomena during decoding. The approach adopts a sophisticated hierarchical prosodic model (HPM), composed of break-syntax, break-acoustic, syllable-prosodic and prosodic-state models, to rescore and improve the TDNN-f+RNNLM-based first-pass decoding output while simultaneously generating word, part-of-speech (POS), punctuation mark (PM), tone, break-type, and prosodic-state tags for further use. Experimental results show that the HPM-based system dramatically reduced the word error rate from the previous best value of 41.8% [1] to 21.2%. It also detected the underlying POS tags, PMs, and tones well (error rates of 10.9%, 12.6%, and 2.3%, respectively). This confirms that the proposed method is very promising for Mandarin spontaneous-speech recognition.
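The second-pass rescoring idea, re-ranking first-pass hypotheses with an additional prosodic score, can be illustrated with a generic log-linear combination. The hypothesis texts, scores, and the interpolation weight below are invented for illustration; the paper's HPM computes its prosodic score from the break and prosodic-state models, which this stand-in does not attempt to model.

```python
# Each hypothesis: (text, first-pass log score from the TDNN-f + RNNLM decoder).
hypotheses = [
    ("hyp A", -10.2),
    ("hyp B", -10.5),
    ("hyp C", -11.0),
]
# Stand-in prosodic log scores (in reality produced by a prosodic model like the HPM).
prosody_scores = {"hyp A": -5.0, "hyp B": -3.0, "hyp C": -6.0}

def rescore(hyps, prosody, weight=0.8):
    """Log-linear rescoring: pick the hypothesis maximizing
    first-pass score + weight * prosodic score."""
    return max(hyps, key=lambda h: h[1] + weight * prosody[h[0]])

best, _ = rescore(hypotheses, prosody_scores)
```

Here the prosodic score promotes "hyp B" over the first-pass winner "hyp A", which is exactly the effect a rescoring pass exploits.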
DOI: https://doi.org/10.1109/O-COCOSDA50338.2020.9295010 (published 2020-11-05)
Citations: 1
Japanese Quotation Marker “tte” in Conversation using Everyday Conversation Corpus*
Yasuyuki Usuda
This study investigates what participants accomplish by employing the Japanese quotation marker “tte” in everyday conversation. Specifically, the focus is on what an utterance does when the marker occurs at its end. In the cases examined here, direct quotations ending in “tte” and similar forms are employed in a telling environment, in which one participant holds the right to speak in order to relate a thought or experience, while the others merely respond as listeners. It is found that such quotations occur in the middle of the telling, and that their content can be heard as the punchline even when it is not. The teller therefore has to deal with the possibility of the quotation being misunderstood as the punchline of the story. Quotation with “tte” thus enables the recipients to understand that the utterance is not the end of the telling. This contributes to the co-construction of the story and to a mutual understanding of the state of the ongoing telling.
DOI: https://doi.org/10.1109/O-COCOSDA50338.2020.9295029 (published 2020-11-05)
Citations: 0
Vietnam Country Report 2020: Updated activities on resources development for Vietnamese Speech and NLP
This article consists only of a collection of slides from the author's conference presentation.
DOI: https://doi.org/10.1109/o-cocosda50338.2020.9295028 (published 2020-11-05)
Citations: 0
A pilot study on the perception of Chinese and English prosodic focus by Chinese learners of English: the effect of foreign accent
Danhong Shen, Ping Tang
Prosodic focus plays an important role in daily speech communication, typically marking the key information in utterances, such as new information. English speakers have been found to use focus to differentiate new from old information in perception, but it was unclear whether Chinese learners of English can do so when perceiving prosodic focus in Mandarin Chinese (L1) and English (L2). Moreover, earlier studies showed that native speakers adapt to foreign-accented semantic or syntactic infelicity, but it was unclear whether they show similar adaptation to infelicitous prosodic focus. The current pilot study therefore explored (1) whether Chinese L2 learners can use prosodic focus to differentiate new from old information when perceiving Chinese and English utterances, and (2) whether they adapt to infelicitous prosodic focus when hearing a foreign accent. Twelve English-major students were recruited as participants. The audio materials included Chinese and English utterances under felicitous, neutral, and infelicitous focus conditions, produced by native speakers (without foreign accents) and by L2 learners (with foreign accents). The visual-world paradigm was adopted to record reaction times and eye movements. The results showed that Chinese L2 learners used prosodic focus to differentiate new from old information in both Chinese and English utterances, responding quickly in the felicitous focus condition. When hearing a foreign accent, however, they did not use prosodic focus to identify new/old information, showing adaptation to infelicitous focus. These preliminary results indicate that L2 learners can accurately perceive prosodic focus in English. They also imply that when hearing foreign-accented utterances, listeners adapt not only to semantic and syntactic infelicity but also to prosodic infelicity.
DOI: https://doi.org/10.1109/O-COCOSDA50338.2020.9295003 (published 2020-11-05)
Citations: 0
Part-Of-Speech Tagger in Malayalam Using Bi-directional LSTM
R. Rajan, Anna J. Joseph, Elizabeth K. Robin, Nishma T. K. Fathima
The majority of human activities are carried out through language, whether communicated directly or reported in natural language. As technology makes the methods and platforms on which we communicate ever more accessible, there is a great need to understand the languages we use. By combining artificial intelligence, computational linguistics, and computer science, natural language processing (NLP) helps machines “read” text by simulating the human ability to understand language. Part-of-speech tagging (POS tagging) is a prerequisite that simplifies many NLP applications, such as question answering, speech recognition, and machine translation. Here, we compare part-of-speech taggers for Malayalam based on a decision tree algorithm and on a bi-directional long short-term memory network (BLSTM). The experiments presented in this paper use two corpora, one of 29,076 sentences and the other of 500 sentences, for performance evaluation. The experiments demonstrate the advantage of a BLSTM-based tagger over conventional decision tree-based tagging for Malayalam.
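To make the tagging task concrete, here is the simplest non-neural reference point that taggers like the decision-tree and BLSTM models above are measured against: a most-frequent-tag baseline. The romanized tokens and tags are invented placeholders, not drawn from the paper's Malayalam corpora.

```python
from collections import Counter, defaultdict

# Tiny tagged corpus (romanized placeholder tokens, not the paper's data).
tagged = [
    [("avan", "PRON"), ("pattu", "NOUN"), ("padi", "VERB")],
    [("aval", "PRON"), ("pattu", "NOUN"), ("kettu", "VERB")],
]

def train_baseline(sentences):
    """Most-frequent-tag baseline: remember each word's commonest tag."""
    counts = defaultdict(Counter)
    for sent in sentences:
        for word, tag in sent:
            counts[word][tag] += 1
    return {w: c.most_common(1)[0][0] for w, c in counts.items()}

def tag(sentence, model, default="NOUN"):
    """Tag each word with its learned tag; unseen words fall back to a default."""
    return [(w, model.get(w, default)) for w in sentence]

model = train_baseline(tagged)
out = tag(["avan", "pattu", "padi"], model)
```

Unlike this lookup table, a BLSTM tagger reads the whole sentence in both directions, so it can disambiguate words whose tag depends on context.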
DOI: https://doi.org/10.1109/o-cocosda50338.2020.9295018 · Published: 2020-11-05 (Journal Article)
Citations: 1
Journal
2020 23rd Conference of the Oriental COCOSDA International Committee for the Co-ordination and Standardisation of Speech Databases and Assessment Techniques (O-COCOSDA)