
2020 23rd Conference of the Oriental COCOSDA International Committee for the Co-ordination and Standardisation of Speech Databases and Assessment Techniques (O-COCOSDA): Latest Papers

Contextual Lexicon Based Sentiment Analysis in Myanmar Text Reviews
Yu Mon Aye, Sint Sint Aung
A great deal of information about commercial applications is available online and can be used to guide and advise prospective customers. People want to share opinions and express sentiments in their own language, but sentiment analyzers developed for English do not work for the Myanmar language. Mining sentiment from Myanmar text raises many issues and challenges. The direction of a sentiment depends heavily on the context of the text, so taking contextual lexical information into account is a significant challenge for classifying polarity correctly. This paper analyzes sentiment classification in the food and restaurant domain using contextual analysis with a lexicon-based approach on Myanmar text reviews. Intensifiers, negations, and objective words play an important role in determining sentiment orientation in context. The paper addresses sentiment classification for the Myanmar language and overcomes one of its language-specific challenges. The accuracy of the proposed system is higher than that of classification without context information (negation, intensifier, and objective words). The overall accuracy of the proposed system is 92%, and the weighted average F-measure over the imbalanced classes of 1200 reviews is 0.93.
DOI: https://doi.org/10.1109/O-COCOSDA50338.2020.9295012; published 2020-11-05
Citations: 2
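The abstract above says that negations, intensifiers, and objective words change the polarity a plain lexicon lookup would give. A minimal sketch of that idea follows. The lexicon entries, weights, and English placeholder tokens are hypothetical and chosen only for illustration, and the modifier-before-sentiment-word ordering is an assumption of the sketch; the paper's actual Myanmar lexicon, word segmentation, and scoring rules are not given in the abstract.

```python
# Hypothetical toy lexicons; English placeholder tokens stand in for
# segmented Myanmar words. None of these values come from the paper.
POLARITY = {"good": 1.0, "tasty": 1.0, "bad": -1.0, "slow": -1.0}
INTENSIFIERS = {"very": 2.0, "slightly": 0.5}
NEGATIONS = {"not", "never"}
OBJECTIVE = {"menu", "table", "price"}  # carry no sentiment of their own


def score_review(tokens):
    """Sum lexicon polarities, letting context words modify the next sentiment word."""
    total = 0.0
    scale = 1.0      # pending intensifier weight
    negate = False   # pending negation flag
    for tok in tokens:
        if tok in NEGATIONS:
            negate = not negate
        elif tok in INTENSIFIERS:
            scale *= INTENSIFIERS[tok]
        elif tok in POLARITY:
            value = POLARITY[tok] * scale
            total += -value if negate else value
            scale, negate = 1.0, False  # context is consumed by this word
        elif tok in OBJECTIVE:
            continue  # objective words are skipped and do not reset context
        else:
            scale, negate = 1.0, False  # unknown word ends the context window
    return total


def classify(tokens):
    """Map the summed score to a polarity label."""
    score = score_review(tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

Dropping the `NEGATIONS` and `INTENSIFIERS` branches reduces this to a context-free lexicon lookup, which is the kind of baseline the abstract compares against.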
The Prosodic Realization of Information Structure for Chinese Discourse Production by Depression Patients
Bin Li, Yuan Jia
In this paper, we present a study of the phonetic realization of information structure in depression patients' production of read Chinese text, in comparison with normal people. Sixteen depression patients and four normal people were analyzed. RefLex, which differentiates information structure on the lexical and referential levels, was selected as the annotation scheme. Duration and pitch range were selected as the main phonetic parameters. The main findings are as follows. Depression patients can distinguish new, given, and accessible information through duration and pitch range. When the degree of information activation increases, patients tend to expand duration and pitch range on both levels. Further, the phonetic distinction between depression patients and normal people emerges mainly in terms of pitch range.
DOI: https://doi.org/10.1109/O-COCOSDA50338.2020.9295022; published 2020-11-05
Citations: 0
Myanmar Dialogue Act Recognition Using Bi-LSTM RNN
Sann Su Su Yee, K. Soe
Spoken language understanding (SLU) is an essential element of any dialogue system, and dialogue act (DA) recognition is a critical pre-processing step for speech understanding and dialogue systems. This paper proposes a deep learning-based DA model that uses a deep recurrent neural network (RNN) with bi-directional long short-term memory (Bi-LSTM). The model consists mainly of a word-encoding layer, a Bi-LSTM layer, and a softmax layer. For corpus preparation, we collected and annotated a large dialogue act corpus, called the MmTravel (Myanmar Travel) corpus, from a travel-domain human-human conversation dataset consisting of 80k utterances. This paper reports an analysis and comparison of the proposed Bi-LSTM model with RNN, LSTM, and baseline SVM models. Experiments on the dataset show that the proposed DA model performs better than our previous work, support vector machine (SVM) models, improving classification accuracy on the dataset by more than 2%.
DOI: https://doi.org/10.1109/O-COCOSDA50338.2020.9295014; published 2020-11-05
Citations: 1
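The three layers named in the abstract above (word-encoding layer, Bi-LSTM layer, softmax layer) can be written out as the sketch below. The symbols $E$, $W$, $b$, and $T$, the one-hot word vector $w_t$, and the choice to classify from the concatenated final forward and backward states are assumptions made for illustration; the abstract does not say how the Bi-LSTM outputs are pooled.

```latex
\begin{aligned}
x_t &= E\,w_t, && t = 1,\dots,T \\
\overrightarrow{h}_t &= \mathrm{LSTM}_{\mathrm{fwd}}\!\left(x_t,\ \overrightarrow{h}_{t-1}\right) \\
\overleftarrow{h}_t &= \mathrm{LSTM}_{\mathrm{bwd}}\!\left(x_t,\ \overleftarrow{h}_{t+1}\right) \\
h &= \left[\overrightarrow{h}_T \,;\, \overleftarrow{h}_1\right] \\
P(y = k \mid w_{1:T}) &= \frac{\exp\!\left(W_k h + b_k\right)}{\sum_{j}\exp\!\left(W_j h + b_j\right)}
\end{aligned}
```

Here $k$ ranges over the dialogue act labels, and the predicted act for an utterance is the label with the highest probability.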
The LDC-IL Speech Corpora
N. Choudhary, D. G. Rao
This paper introduces the first set of speech corpora released in 2019 by the Linguistic Data Consortium for Indian Languages (LDC-IL), a scheme under the Department of Higher Education, Ministry of Human Resource Development, Government of India. The datasets cover 13 scheduled languages of India, collected in various environments across the length and breadth of the country from a total of 5662 speakers of different age groups, with a total size of more than 1552 hours. The dataset is still growing as we prune the data and make it ready for release. The corpus for each language is usually the largest currently available for that language. Established in 2008 along the lines of the LDC at the University of Pennsylvania, the LDC-IL has worked for over 10 years on various types of language resources, including building the speech corpora. LDC-IL is a fully government-funded project implemented by CIIL, Mysuru. Owing to constraints of government business such as cost analysis and copyright issues, it took rather a long time to release the LDC-IL dataset for public use. This paper gives a brief overview of the raw speech corpora now released and ready for public use (for both commercial and non-commercial purposes). It also discusses how the two major bottlenecks of copyright and costing, which held up the release of these datasets for several years, were addressed.
DOI: https://doi.org/10.1109/O-COCOSDA50338.2020.9295011; published 2020-11-05
Citations: 2
Phrase-Based Named Entity Transliteration on Myanmar-English Terminology Dictionary
A. M. Mon, K. Soe
Named entity (NE) transliteration is mainly a phonetically based transcription of names across languages that use different writing systems. For the Myanmar language, robust transliteration of named entities is still a challenging task because of the complex writing system and the lack of data. The Myanmar NE transliteration dictionary has so far grown to over 135,255 NE instance pairs of Western person, organization, and place names. We run experiments with a phrase-based statistical machine translation (PBSMT) model using 2-gram, 3-gram, 4-gram, 5-gram, and 6-gram language models in decoding. Different units of the Myanmar script, i.e., characters and syllables, are compared. We perform experiments on a 1,000-item test set and a 1,000-item development set from our proposed dictionary and measure the performance of our system with the bilingual evaluation understudy (BLEU) score. We discuss detailed observations from our experiments in this paper. According to the evaluations, we obtained significant results: an 89.3% BLEU score with syllable units for the Myanmar (Myan) to English (Eng) transliteration direction, and an 82.0% BLEU score with character units for the English (Eng) to Myanmar (Myan) direction.
DOI: https://doi.org/10.1109/O-COCOSDA50338.2020.9295015; published 2020-11-05
Citations: 0
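The abstract above scores transliterations with BLEU over character or syllable units. The sketch below shows how such a score can be computed at corpus level with one reference per output. It is an illustration only: the 4-gram maximum, the absence of smoothing, and the romanized example names are assumptions, and the paper's own evaluation settings are not stated in the abstract.

```python
import math
from collections import Counter


def ngrams(units, n):
    """Count all n-grams of length n in a sequence of units."""
    return Counter(tuple(units[i:i + n]) for i in range(len(units) - n + 1))


def corpus_bleu(hypotheses, references, max_n=4):
    """Corpus-level BLEU with one reference per hypothesis and no smoothing."""
    matched = [0] * max_n
    total = [0] * max_n
    hyp_len = ref_len = 0
    for hyp, ref in zip(hypotheses, references):
        hyp_len += len(hyp)
        ref_len += len(ref)
        for n in range(1, max_n + 1):
            hyp_counts = ngrams(hyp, n)
            ref_counts = ngrams(ref, n)
            # Clip each hypothesis n-gram count by its count in the reference.
            matched[n - 1] += sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
            total[n - 1] += sum(hyp_counts.values())
    if hyp_len == 0 or min(matched) == 0:
        return 0.0  # without smoothing, any empty n-gram order gives zero
    log_precision = sum(math.log(m / t) for m, t in zip(matched, total)) / max_n
    brevity = 1.0 if hyp_len > ref_len else math.exp(1.0 - ref_len / hyp_len)
    return brevity * math.exp(log_precision)


# Character units: each (hypothetical) name is split into its characters.
hyps = [list("yangon"), list("mandalay")]
refs = [list("yangon"), list("mandalai")]
score = corpus_bleu(hyps, refs)
```

Passing lists of syllables instead of lists of characters changes only the unit of comparison; the scoring code stays the same.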
Formosa Speech Recognition Challenge 2020 and Taiwanese Across Taiwan Corpus
Y. Liao, Chia-Yu Chang, Hak-Khiam Tiun, Huang-Lan Su, Hui-Lu Khoo, Jane S. Tsay, Le-Kun Tan, Peter Kang, Tsun-guan Thiann, Un-Gian Iunn, Jyh-Her Yang, Chih-Neng Liang
Taiwanese (a.k.a. Taiwanese Hokkien, Hoklo, Taigi, Southern Min, or Min-Nan) is an endangered language: because of the dominance of Mandarin, the number of Taiwanese speakers continues to drop, especially among the younger generations. To address this problem, a Taiwanese speech-enabled human-computer interface that supports people's daily life is essential. Therefore, the Formosa Speech in the Wild (FSW) project was established to collect a large-scale Taiwanese Across Taiwan (TAT) speech corpus to boost the development of Taiwanese speech recognition (TSR). The Formosa Speech Recognition Challenge 2020 (FSR-2020) was also hosted to promote the corpus and to evaluate the performance of state-of-the-art TSR systems. This paper briefly introduces the TAT corpus and the FSR-2020 challenge, presents the data profile and evaluation plan, and reports experimental baseline results. A subset of the TAT corpus, TAT-Vol1, is provided free of charge to all participants under a non-commercial license, and its corresponding Kaldi baseline recipes have been published online. Experimental results show that the combination of the TAT corpus and the baseline recipes is a good resource pack for TSR research and development.
DOI: https://doi.org/10.1109/O-COCOSDA50338.2020.9295019; published 2020-11-05
Citations: 13
Spoken Language Identification of Four Tibeto-Burman languages
Joyshree Chakraborty, Priyankoo Sarmah, K. Samudravijaya
Bodo, Dimasa, Rabha, and Tiwa are languages of the Tibeto-Burman language family, spoken in north-east India and surrounding areas. Bodo is also one of the 22 official languages of the Government of India. Consequently, spoken language systems have been developed for Bodo, whereas similar systems for the other languages are yet to be developed. Here, we present the details of an automatic Language Identification (LID) system that identifies the language of an input speech file without using phonetic information. The text-independent LID system was implemented using a Gaussian mixture model with Mel-Frequency Cepstral Coefficients (MFCCs) as features. A 3-fold cross-validation methodology was adopted to assess the performance of the system. The accuracy of the LID system was highest when suprasegmental features were used in addition to segmental features. The best LID system, using a 62-dimensional feature vector consisting of 13 MFCCs and 49 shifted delta coefficients, yields 92.7% accuracy when the duration of the test data is 3 seconds.
DOI: https://doi.org/10.1109/O-COCOSDA50338.2020.9295008; published 2020-11-05
Citations: 0
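The 62-dimensional feature in the abstract above combines 13 static MFCCs with 49 shifted delta coefficients (SDC). The sketch below shows one way such a vector can be assembled. The parameter values are an assumption: 49 would match the widely used N-d-P-k = 7-1-3-7 SDC configuration (7 coefficients, 7 shifted blocks), but the abstract does not state which configuration the authors used, and the edge-clamping rule is likewise a choice made for this sketch.

```python
def shifted_delta(frames, n=7, d=1, p=3, k=7):
    """Return one SDC vector of length n*k per frame.

    frames: list of per-frame cepstral vectors (lists of floats).
    Indices past either end of the utterance are clamped to the edge frame.
    """
    last = len(frames) - 1

    def frame_at(t):
        return frames[min(max(t, 0), last)]

    out = []
    for t in range(len(frames)):
        vec = []
        for i in range(k):
            ahead = frame_at(t + i * p + d)
            behind = frame_at(t + i * p - d)
            # Delta over the first n cepstral coefficients, shifted by i*p frames.
            vec.extend(ahead[j] - behind[j] for j in range(n))
        out.append(vec)
    return out


def stack_features(mfcc_frames, **sdc_args):
    """Concatenate each static MFCC vector with its SDC vector."""
    sdc = shifted_delta(mfcc_frames, **sdc_args)
    return [static + delta for static, delta in zip(mfcc_frames, sdc)]
```

With 13-dimensional input frames and the default arguments, `stack_features` returns 13 + 7 × 7 = 62 values per frame. Because each SDC block looks several frames ahead, the stacked vector carries longer-span temporal information than the static MFCCs alone.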
Improved Word Alignment System for Myanmar-English Machine Translation
Nway Nway Han, A. Thida
Word alignment is an essential task for every Statistical Machine Translation (SMT) system. An alignment is a set of correspondences between the words of parallel sentences. The problem of word alignment in SMT is to find strong alignments in the corresponding sentence pairs. The popular word alignment system GIZA++ needs improvement for Myanmar-English machine translation because Myanmar is an inflected language and is also resource-scarce. For this reason, this paper presents a word alignment system that adds extra resources, namely word and Name Entity Recognition (NER) translation pairs, to the existing training data. Experimental results show that the proposed word alignment system achieves a lower Alignment Error Rate (AER) than the baseline.
DOI: https://doi.org/10.1109/O-COCOSDA50338.2020.9295043; published 2020-11-05
Citations: 0
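The abstract above reports Alignment Error Rate without defining it. The sketch below uses the definition common in the word-alignment literature, with a gold standard split into sure links S and possible links P: AER = 1 - (|A ∩ S| + |A ∩ P|) / (|A| + |S|). That the paper uses exactly this variant is an assumption, and the link pairs in the usage note are hypothetical.

```python
def alignment_error_rate(hypothesis, sure, possible=None):
    """AER = 1 - (|A & S| + |A & P|) / (|A| + |S|), with S a subset of P.

    Each link is a (source_index, target_index) pair.
    """
    a = set(hypothesis)
    s = set(sure)
    p = s | set(possible or ())  # sure links are always also possible links
    if not a and not s:
        return 0.0  # nothing to align and nothing proposed
    return 1.0 - (len(a & s) + len(a & p)) / (len(a) + len(s))
```

For example, with sure links `{(0, 0), (1, 1)}` and a hypothesis `{(0, 0), (2, 2)}`, one of the two proposed links is correct, so the function returns 0.5. Lower values mean better agreement with the gold alignment.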
2020 Philippine Country Report
Nathaniel Oco, Sambal Botolan, •. L. Archives, Philippine Hokkien
This article consists only of a collection of slides from the author's conference presentation.
DOI: https://doi.org/10.1109/o-cocosda50338.2020.9294997; published 2020-11-05
Citations: 0
O-COCOSDA 2020 Thailand Report November 2020
DOI: https://doi.org/10.1109/O-COCOSDA50338.2020.9295037; published 2020-11-05
Citations: 0