
2019 International Conference on Asian Language Processing (IALP): Latest Publications

Coarse-to-Fine Document Ranking for Multi-Document Reading Comprehension with Answer-Completion
Pub Date : 2019-11-01 DOI: 10.1109/IALP48816.2019.9037670
Hongyu Liu, Shumin Shi, Heyan Huang
Multi-document machine reading comprehension (MRC) has two characteristics compared with traditional MRC: 1) many documents are irrelevant to the question; 2) answers are relatively long. However, existing models not only ignore key ranking signals at different granularities, but also rarely predict complete answers, as they mostly treat the start and end tokens of each answer equally. To address these issues, we propose a model that fuses coarse-to-fine ranking processes over document chunks to distinguish documents more effectively. Furthermore, we incorporate an answer-completion strategy that predicts complete answers by modifying the loss function. Experimental results show that our multi-document MRC model achieves significant improvements of 7.4% in Rouge-L and 13% in BLEU-4 over current models on DuReader, a public Chinese dataset.
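The cascade described above can be sketched end to end. This is a hypothetical illustration, not the paper's model: the fixed-size chunking and the unigram/bigram overlap scores stand in for the learned coarse and fine rankers.

```python
def unigram_overlap(question, text):
    # Cheap coarse score: fraction of question words found in the text.
    q, t = set(question.split()), set(text.split())
    return len(q & t) / (len(q) or 1)

def bigram_overlap(question, text):
    # Stricter fine score: fraction of question bigrams found in the text.
    def bigrams(s):
        w = s.split()
        return set(zip(w, w[1:]))
    q, t = bigrams(question), bigrams(text)
    return len(q & t) / (len(q) or 1)

def coarse_to_fine_rank(question, documents, chunk_size=50, top_k=3):
    # Coarse stage: score each document by its best chunk under the cheap
    # unigram-overlap score and keep only the top_k documents.
    scored = []
    for doc in documents:
        words = doc.split()
        chunks = [" ".join(words[i:i + chunk_size])
                  for i in range(0, len(words), chunk_size)] or [""]
        scored.append((max(unigram_overlap(question, c) for c in chunks), doc))
    survivors = [d for _, d in sorted(scored, reverse=True)[:top_k]]
    # Fine stage: re-rank the survivors with the stricter bigram score.
    return sorted(survivors, key=lambda d: bigram_overlap(question, d),
                  reverse=True)
```

In the paper both stages are learned models; here the point is only the control flow of ranking at chunk granularity first, then re-scoring a small survivor set.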
Citations: 0
CIEA: A Corpus for Chinese Implicit Emotion Analysis
Pub Date : 2019-11-01 DOI: 10.1109/IALP48816.2019.9037667
Dawei Li, Jin Wang, Xuejie Zhang
The traditional cultural euphemism of the Han nationality has profound ideological roots. China has long advocated Confucianism, which has led Chinese people to express their emotions implicitly. Spoken language contains almost no explicitly emotional words, which poses a challenge for Chinese sentiment analysis. It is therefore interesting to build a corpus that contains no explicit emotion words, relying instead on detailed textual description to determine the category of the emotion expressed. In this study, we propose a corpus for Chinese implicit sentiment analysis. To build it, we crawled millions of microblogs; after data cleaning and processing, we obtained the corpus. Based on this corpus, we applied both conventional models and neural networks to implicit sentiment analysis and achieved promising results. A comparative experiment with a well-known corpus showed the importance of implicit emotions for emotion classification. This demonstrates the usefulness of the proposed corpus for implicit sentiment analysis research and provides a baseline for further work on this topic.
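The corpus-construction criterion (emotion expressed without explicit emotion words) can be illustrated with a toy filter. The tiny English lexicon, the sample posts, and the `is_implicit` helper are hypothetical stand-ins for the actual Chinese cleaning pipeline.

```python
# Toy lexicon of explicit emotion words; the real pipeline would use a
# Chinese emotion lexicon (an assumption for illustration).
EXPLICIT_EMOTION_WORDS = {"happy", "sad", "angry", "love", "hate"}

def is_implicit(post_text):
    # A post is kept only if none of its tokens is an explicit emotion word,
    # so any emotion must be inferred from the description itself.
    tokens = set(post_text.lower().split())
    return tokens.isdisjoint(EXPLICIT_EMOTION_WORDS)

posts = [
    ("I am so happy today", "joy"),                           # explicit, dropped
    ("The sun finally came out after weeks of rain", "joy"),  # implicit, kept
]
implicit_corpus = [(text, label) for text, label in posts if is_implicit(text)]
```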
Citations: 0
Improved DNN-HMM English Acoustic Model Specially For Phonotactic Language Recognition
Pub Date : 2019-11-01 DOI: 10.1109/IALP48816.2019.9037696
Weiwei Liu, Ying Yin, Ya-Nan Li, Yu-Bin Huang, Ting Ruan, Wei Liu, Rui-Li Du, Hua Bai, Wei Li, Sheng-Ge Zhang, Guo-Chun Li, Cun-Xue Zhang, Hai-Feng Yan, Jing He, Ying-Xin Gan, Yan-Miao Song, Jianhua Zhou, Jian-zhong Liu
The now-acknowledged sensitivity of Phonotactic Language Recognition (PLR) to the performance of the phone-recognizer front-end has spawned interest in many methods to improve it. In this paper, an improved Deep Neural Network Hidden Markov Model (DNN-HMM) English acoustic-model front-end designed specifically for phonotactic language recognition is proposed, and a series of methods, including dictionary merging, phoneme splitting, phoneme clustering, state clustering, and DNN-HMM acoustic modeling (DPPSD), is introduced to balance the generalization and accuracy of speech tokenization in PLR. Experiments are carried out on the database of the National Institute of Standards and Technology language recognition evaluation 2009 (NIST LRE 2009). The results show that, using state-of-the-art techniques, the DPPSD English acoustic model based phonotactic language recognition system yields equal error rates (EER) of 2.09%, 6.60%, and 19.72% for the 30 s, 10 s, and 3 s conditions, outperforming language recognition results based on both the TIMIT and CMU dictionaries as well as other phoneme clustering methods.
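The equal error rate figures quoted above refer to the operating point where the false-acceptance and false-rejection rates coincide. A generic sketch of computing EER from raw detection scores (not the paper's evaluation tooling):

```python
def equal_error_rate(target_scores, nontarget_scores):
    # Sweep every observed score as a candidate threshold and return the
    # mean of FAR and FRR at the point where they are closest to equal.
    best = (1.0, 1.0, 1.0)  # (|FAR - FRR|, FAR, FRR)
    for thr in sorted(target_scores + nontarget_scores):
        far = sum(s >= thr for s in nontarget_scores) / len(nontarget_scores)
        frr = sum(s < thr for s in target_scores) / len(target_scores)
        if abs(far - frr) < best[0]:
            best = (abs(far - frr), far, frr)
    return (best[1] + best[2]) / 2
```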
Citations: 0
Fusion of Image-text attention for Transformer-based Multimodal Machine Translation
Pub Date : 2019-11-01 DOI: 10.1109/IALP48816.2019.9037732
Junteng Ma, Shihao Qin, Lan Su, Xia Li, Lixian Xiao
In recent years, multimodal machine translation has become a hot research topic. In this paper, a machine translation model based on the self-attention mechanism is extended to multimodal machine translation. In the model, an image-text attention layer is added at the end of the encoder to capture the relevant semantic information between image regions and text words. With this attention layer, the model can capture the different weights between words that are relevant to the image or appear in the image, and obtain a better text representation that fuses these weights, so that it can be better used in decoding. Experiments are carried out on the original English-German sentence pairs of the multimodal machine translation dataset Multi30k and on manually annotated Indonesian-Chinese sentence pairs. The results show that our model performs better than a text-only Transformer-based machine translation model and is comparable to most existing work, proving the effectiveness of our model.
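The added layer can be illustrated with plain scaled dot-product attention from text tokens over image-region features. The shapes, the softmax formulation, and the residual fusion below are assumptions for illustration rather than the paper's exact architecture.

```python
import numpy as np

def image_text_attention(text_states, image_feats):
    # text_states: (n_tokens, d); image_feats: (n_regions, d)
    d = text_states.shape[-1]
    scores = text_states @ image_feats.T / np.sqrt(d)   # (n_tokens, n_regions)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)      # softmax over regions
    context = weights @ image_feats                     # attended image context
    return text_states + context                        # residual fusion

rng = np.random.default_rng(0)
fused = image_text_attention(rng.standard_normal((5, 16)),
                             rng.standard_normal((3, 16)))
```

Each text token thus receives a weighted summary of the image regions most relevant to it, and the fused representation keeps the text shape so the decoder is unchanged.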
Citations: 1
Confidence Modeling for Neural Machine Translation
Pub Date : 2019-11-01 DOI: 10.1109/IALP48816.2019.9037709
Taichi Aida, Kazuhide Yamamoto
Current neural machine translation methods output incorrect sentences alongside correctly translated ones. Consequently, users of neural machine translation systems have no way to check which output sentences were translated correctly without employing an evaluation method. We therefore aim to define confidence values for neural machine translation models. We suppose that setting a threshold on the confidence value would let correctly translated sentences exceed the threshold, so that only clearly translated sentences are output. Users of such a translation tool can then obtain a certain level of confidence in translation correctness. We propose several indices: sentence log-likelihood, minimum variance, and average variance. We then calculated the correlation between each index and the bilingual evaluation understudy (BLEU) score to investigate the appropriateness of the defined confidence indices. As a result, sentence log-likelihood and average variance calculated from probabilities have a weak correlation with the BLEU score. Furthermore, when we set each index as a threshold, we could output only high-quality translated sentences instead of all translated sentences of widely varying quality, as in previous work.
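The three proposed indices can be sketched directly from the per-token probability distributions a decoder produces. The exact definitions in the paper may differ; here variance is taken over each output distribution and then reduced across the sentence.

```python
import math

def confidence_indices(token_dists, picked_ids):
    # token_dists: one probability vector per decoding step
    # picked_ids: index of the emitted token at each step
    log_lik = sum(math.log(dist[i]) for dist, i in zip(token_dists, picked_ids))

    def variance(dist):
        mean = sum(dist) / len(dist)
        return sum((p - mean) ** 2 for p in dist) / len(dist)

    variances = [variance(d) for d in token_dists]
    return {
        "sentence_log_likelihood": log_lik,
        "min_variance": min(variances),
        "avg_variance": sum(variances) / len(variances),
    }
```

A user-facing system would then emit a translation only when, say, its sentence log-likelihood exceeds a tuned threshold.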
Citations: 0
Research on New Event Detection Methods for Mongolian News
Pub Date : 2019-11-01 DOI: 10.1109/IALP48816.2019.9037708
Shijie Wang, F. Bao, Guanglai Gao
New event detection (NED) aims to detect the first report of an event from one or multiple streams of news stories. This paper targets the field of journalism and studies methods for Mongolian new event detection. We propose a method that combines the similarity of news content with the similarity of news elements to detect new events. For news content representation, we improve the traditional TF-IDF method according to the characteristics of news and the different vocabulary used in different news categories. In addition, we extract the main elements of each news story (time, place, subject, object, and denoter) and calculate the element similarity between two news documents. Finally, the content similarity and element similarity are combined into a final similarity for new event detection. Experimental results show that the improvement is evident, and the performance is significantly better than that of a traditional new event detection system.
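The combination step can be sketched as a weighted mix of the two similarities. The weight `alpha`, the element fields, and the exact-match rule below are assumptions for illustration, not the paper's formula.

```python
def element_similarity(e1, e2,
                       fields=("time", "place", "subject", "object", "denoter")):
    # Fraction of news elements that are present and identical in both stories.
    matched = sum(e1.get(f) is not None and e1.get(f) == e2.get(f)
                  for f in fields)
    return matched / len(fields)

def combined_similarity(content_sim, elems1, elems2, alpha=0.6):
    # Weighted mix of content similarity (e.g. improved TF-IDF cosine) and
    # element similarity; a story would be flagged as a new event when its
    # best combined similarity to all prior stories falls below a threshold.
    return alpha * content_sim + (1 - alpha) * element_similarity(elems1, elems2)
```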
Citations: 2
The Initial Research of Mongolian Literary Corpus-Take the Text of Da.Nachugdorji’s Work for Instance
Pub Date : 2019-11-01 DOI: 10.1109/IALP48816.2019.9037660
Yin Hai
Today, Mongolian corpus research is gradually developing from the basic resource-construction stage toward in-depth research covering multi-level processing, author-corpus-based quantitative analysis, and the development of multi-functional electronic dictionaries. However, there are still many shortcomings in the collection, development, and processing of literary corpora. In this paper, the author introduces the corpus of Da.Nachugdorji’s literature and discusses its significance, carries out multi-level processing such as lexical, syntactic, and semantic annotation, and discusses preliminary processing research on a Mongolian literary corpus from the perspective of part-of-speech statistics, word and phrase frequency, and the computation of lexical richness.
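The quantitative statistics mentioned (word frequency, lexical richness) can be sketched over any tokenized text; type-token ratio is used here as a minimal richness measure, which may differ from the measure used in the paper.

```python
from collections import Counter

def corpus_stats(tokens):
    # Word frequency plus a simple lexical-richness measure (type-token ratio).
    freq = Counter(tokens)
    return {
        "tokens": len(tokens),
        "types": len(freq),
        "type_token_ratio": len(freq) / len(tokens),
        "top_words": freq.most_common(3),
    }
```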
Citations: 0
Articulatory Features Based TDNN Model for Spoken Language Recognition
Pub Date : 2019-11-01 DOI: 10.1109/IALP48816.2019.9037566
Jiawei Yu, Minghao Guo, Yanlu Xie, Jinsong Zhang
In order to improve the performance of Spoken Language Recognition (SLR) systems, we propose an acoustic modeling framework in which a Time Delay Neural Network (TDNN) models long-term dependencies between Articulatory Features (AFs). Several experiments were conducted on the APSIPA 2017 Oriental Language Recognition (AP17-OLR) database. We compared the AF-based TDNN approach to Deep Bottleneck (DBN) feature based i-vector and x-vector systems; the proposed approach provides relative improvements of 23.10% and 12.87% in Equal Error Rate (EER). These results indicate that the proposed approach is beneficial to the SLR task.
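A single TDNN layer can be sketched as an affine transform over a spliced window of frames at fixed time offsets, which is how stacked layers widen temporal context. The offsets and the ReLU nonlinearity are typical choices, not necessarily the paper's configuration.

```python
import numpy as np

def tdnn_layer(frames, weight, offsets=(-2, 0, 2)):
    # frames: (T, d_in); weight: (len(offsets) * d_in, d_out).
    # Each output frame sees the input frames at the given time offsets.
    T = frames.shape[0]
    lo, hi = -min(offsets), max(offsets)
    out = []
    for t in range(lo, T - hi):
        spliced = np.concatenate([frames[t + o] for o in offsets])
        out.append(np.maximum(spliced @ weight, 0.0))  # affine + ReLU
    return np.array(out)

rng = np.random.default_rng(0)
activations = tdnn_layer(rng.standard_normal((10, 4)),
                         rng.standard_normal((12, 8)))
```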
Citations: 4
A Multi-stage Strategy for Chinese Discourse Tree Construction
Pub Date : 2019-11-01 DOI: 10.1109/IALP48816.2019.9037684
Tishuang Wang, Peifeng Li, Qiaoming Zhu
Building discourse trees is crucial to improving the performance of discourse parsing. Previous work on discourse tree construction has two issues: error accumulation, and the influence of connectives in transition-based algorithms. To address these issues, this paper proposes a tensor-based neural network with a multi-stage strategy and a connective-deletion mechanism. Experimental results on both CDTB and RST-DT show that our model achieves state-of-the-art performance.
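The transition-based construction mentioned above can be illustrated with a toy shift-reduce builder over elementary discourse units; nuclearity and relation labels are omitted, and in practice the action sequence comes from a trained classifier.

```python
def build_tree(edus, actions):
    # SHIFT pushes the next elementary discourse unit (EDU) onto the stack;
    # REDUCE merges the top two stack items into one subtree.
    stack, queue = [], list(edus)
    for act in actions:
        if act == "SHIFT":
            stack.append(queue.pop(0))
        elif act == "REDUCE":
            right, left = stack.pop(), stack.pop()
            stack.append((left, right))
    return stack[-1]
```

Error accumulation arises because each action is conditioned on the stack produced by earlier, possibly wrong, actions, which motivates the paper's multi-stage strategy.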
Citations: 0
Automatic Meta-evaluation of Low-Resource Machine Translation Evaluation Metrics
Pub Date : 2019-11-01 DOI: 10.1109/IALP48816.2019.9037658
Junting Yu, Wuying Liu, Hongye He, Lin Wang
Meta-evaluation is a method for assessing machine translation (MT) evaluation metrics according to certain theories and standards. This paper presents an automatic meta-evaluation method based on ORANGE, called Limited ORANGE, which is applied to low-resource machine translation evaluation and adopted when resources are limited. We take three n-gram-based metrics, BLEUS, ROUGE-L, and ROUGE-S, for experiments (horizontal comparison), and also use vertical comparison to compare different forms of the same evaluation metric. Compared with the traditional human method, this method evaluates metrics automatically without extra human involvement beyond a set of references. It needs only the average rank of the references, is not influenced by subjective factors, and costs less money and time than the traditional approach. It benefits machine translation system parameter optimization and shortens the system development period. In this paper, we use this automatic meta-evaluation method to evaluate BLEUS, ROUGE-L, ROUGE-S, and their variants based on Cilin on a Russian-Chinese dataset. The results agree with those of traditional human meta-evaluation, verifying the consistency and effectiveness of Limited ORANGE.
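The average-rank idea behind ORANGE can be sketched as follows: score every machine candidate and the human reference with the metric under test, and record the reference's rank; a better metric gives references a better (lower) average rank. Scoring the reference with the same metric is a simplification for illustration, and `unigram_f` is a toy stand-in for BLEUS/ROUGE.

```python
def unigram_f(candidate, reference):
    # Toy metric: unigram F-score between candidate and reference word sets.
    c, r = set(candidate.split()), set(reference.split())
    return 2 * len(c & r) / (len(c) + len(r))

def average_reference_rank(metric, cases):
    # cases: list of (reference, machine_candidates); metric(candidate, reference)
    ranks = []
    for reference, candidates in cases:
        pool = candidates + [reference]
        ordered = sorted(pool, key=lambda s: metric(s, reference), reverse=True)
        ranks.append(ordered.index(reference) + 1)
    return sum(ranks) / len(ranks)
```

A sensible metric ranks each reference first (average rank 1), while an uninformative metric leaves references buried among the machine outputs.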
Citations: 0