低资源黏着语形态标注的联合学习模型

Special Interest Group on Computational Morphology and Phonology Workshop Pub Date : 1900-01-01 DOI:10.18653/v1/2023.sigmorphon-1.4

Gulinigeer Abudouwaili, Kahaerjiang Abiderexiti, Nian Yi, Aishan Wumaier

{"title":"低资源黏着语形态标注的联合学习模型","authors":"Gulinigeer Abudouwaili, Kahaerjiang Abiderexiti, Nian Yi, Aishan Wumaier","doi":"10.18653/v1/2023.sigmorphon-1.4","DOIUrl":null,"url":null,"abstract":"Due to the lack of data resources, rule-based or transfer learning is mainly used in the morphological tagging of low-resource languages. However, these methods require expert knowledge, ignore contextual features, and have error propagation. Therefore, we propose a joint morphological tagger for low-resource agglutinative languages to alleviate the above challenges. First, we represent the contextual input with multi-dimensional features of agglutinative words. Second, joint training reduces the direct impact of part-of-speech errors on morphological features and increases the indirect influence between the two types of labels through a fusion mechanism. Finally, our model separately predicts part-of-speech and morphological features. Part-of-speech tagging is regarded as sequence tagging. When predicting morphological features, two-label adjacency graphs are dynamically reconstructed by integrating multilingual global features and monolingual local features. Then, a graph convolution network is used to learn the higher-order intersection of labels. A series of experiments show that the proposed model in this paper is superior to other comparative models.","PeriodicalId":186158,"journal":{"name":"Special Interest Group on Computational Morphology and Phonology Workshop","volume":"38 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Joint Learning Model for Low-Resource Agglutinative Language Morphological Tagging\",\"authors\":\"Gulinigeer Abudouwaili, Kahaerjiang Abiderexiti, Nian Yi, Aishan Wumaier\",\"doi\":\"10.18653/v1/2023.sigmorphon-1.4\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Due to the lack of data resources, rule-based or transfer learning is mainly used in the morphological tagging of low-resource languages. However, these methods require expert knowledge, ignore contextual features, and have error propagation. Therefore, we propose a joint morphological tagger for low-resource agglutinative languages to alleviate the above challenges. First, we represent the contextual input with multi-dimensional features of agglutinative words. Second, joint training reduces the direct impact of part-of-speech errors on morphological features and increases the indirect influence between the two types of labels through a fusion mechanism. Finally, our model separately predicts part-of-speech and morphological features. Part-of-speech tagging is regarded as sequence tagging. When predicting morphological features, two-label adjacency graphs are dynamically reconstructed by integrating multilingual global features and monolingual local features. Then, a graph convolution network is used to learn the higher-order intersection of labels. A series of experiments show that the proposed model in this paper is superior to other comparative models.\",\"PeriodicalId\":186158,\"journal\":{\"name\":\"Special Interest Group on Computational Morphology and Phonology Workshop\",\"volume\":\"38 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"1900-01-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Special Interest Group on Computational Morphology and Phonology Workshop\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.18653/v1/2023.sigmorphon-1.4\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Special Interest Group on Computational Morphology and Phonology Workshop","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.18653/v1/2023.sigmorphon-1.4","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 0

摘要

由于缺乏数据资源，基于规则的学习或迁移学习主要用于低资源语言的形态标注。然而，这些方法需要专业知识，忽略了上下文特征，并且存在错误传播。因此，我们提出了一种针对低资源黏着语言的联合形态标注器来缓解上述挑战。首先，我们用黏着词的多维特征来表示语境输入。其次，联合训练减少词性错误对词形特征的直接影响，通过融合机制增加两类标签之间的间接影响。最后，我们的模型分别预测词性和形态特征。词性标注被认为是序列标注。在预测形态特征时，通过整合多语言全局特征和单语言局部特征，动态重构双标签邻接图。然后，使用图卷积网络学习标签的高阶交集。一系列实验表明，本文提出的模型优于其他比较模型。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

Joint Learning Model for Low-Resource Agglutinative Language Morphological Tagging

Due to the lack of data resources, rule-based or transfer learning is mainly used in the morphological tagging of low-resource languages. However, these methods require expert knowledge, ignore contextual features, and have error propagation. Therefore, we propose a joint morphological tagger for low-resource agglutinative languages to alleviate the above challenges. First, we represent the contextual input with multi-dimensional features of agglutinative words. Second, joint training reduces the direct impact of part-of-speech errors on morphological features and increases the indirect influence between the two types of labels through a fusion mechanism. Finally, our model separately predicts part-of-speech and morphological features. Part-of-speech tagging is regarded as sequence tagging. When predicting morphological features, two-label adjacency graphs are dynamically reconstructed by integrating multilingual global features and monolingual local features. Then, a graph convolution network is used to learn the higher-order intersection of labels. A series of experiments show that the proposed model in this paper is superior to other comparative models.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

Special Interest Group on Computational Morphology and Phonology Workshop

自引率

0.00%

发文量