Discriminating between Mandarin Chinese and Swiss-German varieties using adaptive language models

Proceedings of the Sixth Workshop on [...] | Pub Date: 2019-04-30 | DOI: 10.18653/v1/W19-1419

T. Jauhiainen, Krister Lindén, H. Jauhiainen

Citations: 19

Abstract

This paper describes the language identification systems used by the SUKI team in the Discriminating between the Mainland and Taiwan variation of Mandarin Chinese (DMT) and the German Dialect Identification (GDI) shared tasks, which were held as part of the third VarDial Evaluation Campaign. The DMT shared task included two separate tracks, one for the simplified Chinese script and one for the traditional Chinese script. We submitted three runs on both tracks of the DMT task as well as on the GDI task. We won the traditional Chinese track using Naive Bayes with language model adaptation, came second on GDI with an adaptive version of the HeLI 2.0 method, and came third on the simplified Chinese track, again using the adaptive Naive Bayes.
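The abstract names "Naive Bayes with language model adaptation" but does not spell out the procedure. As a rough illustration only, the sketch below assumes one common form of adaptation: classify the test lines, take the most confidently identified one, add its character n-grams to the model of the predicted variety, and re-score the rest. The class `AdaptiveNaiveBayes`, the function `classify_adaptively`, the trigram features, the add-one smoothing, the uniform class priors, and the margin-based confidence are all assumptions made for this example and are not taken from the paper.

```python
import math
from collections import Counter, defaultdict


def char_ngrams(text, n=3):
    """Return overlapping character n-grams of a space-padded line."""
    padded = f" {text} "
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


class AdaptiveNaiveBayes:
    """Hypothetical character n-gram Naive Bayes whose models can grow."""

    def __init__(self, n=3, alpha=1.0):
        self.n = n
        self.alpha = alpha                  # additive smoothing constant
        self.counts = defaultdict(Counter)  # label -> n-gram counts
        self.vocab = set()

    def add(self, text, label):
        """Add a line's n-grams to the language model of `label`."""
        grams = char_ngrams(text, self.n)
        self.counts[label].update(grams)
        self.vocab.update(grams)

    def scores(self, text):
        """Log-likelihood of the line under each label (uniform priors)."""
        grams = char_ngrams(text, self.n)
        v = len(self.vocab) + 1  # +1 reserves mass for unseen n-grams
        result = {}
        for label, counter in self.counts.items():
            total = sum(counter.values())
            result[label] = sum(
                math.log((counter[g] + self.alpha) / (total + self.alpha * v))
                for g in grams
            )
        return result

    def predict_with_margin(self, text):
        """Best label plus the score gap to the runner-up (confidence)."""
        ranked = sorted(self.scores(text).items(),
                        key=lambda kv: kv[1], reverse=True)
        best_label, best = ranked[0]
        margin = best - ranked[1][1] if len(ranked) > 1 else float("inf")
        return best_label, margin


def classify_adaptively(model, test_lines, batch=1):
    """Label test lines, most confident first, adapting the model as we go."""
    labels = [None] * len(test_lines)
    pending = list(range(len(test_lines)))
    while pending:
        scored = [(model.predict_with_margin(test_lines[i]), i)
                  for i in pending]
        scored.sort(key=lambda item: item[0][1], reverse=True)
        for (label, _margin), i in scored[:batch]:
            labels[i] = label
            model.add(test_lines[i], label)  # adaptation step
            pending.remove(i)
    return labels


# Toy usage with made-up strings and placeholder labels "A" and "B".
model = AdaptiveNaiveBayes()
model.add("aaaa aaaa", "A")
model.add("bbbb bbbb", "B")
predicted = classify_adaptively(model, ["aaab", "bbba"])
```

Re-scoring every pending line after each adaptation step keeps the sketch simple but costs quadratic time in the number of test lines; a larger `batch` trades some of that cost against adapting on less confident predictions.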


