跨语言投射的半监督学习框架

2011 IEEE/WIC/ACM International Conferences on Web Intelligence and Intelligent Agent Technology Pub Date : 2011-08-22 DOI:10.1109/WI-IAT.2011.58

PengLong Hu, Mo Yu, Jing Li, Conghui Zhu, T. Zhao

{"title":"跨语言投射的半监督学习框架","authors":"PengLong Hu, Mo Yu, Jing Li, Conghui Zhu, T. Zhao","doi":"10.1109/WI-IAT.2011.58","DOIUrl":null,"url":null,"abstract":"Cross-lingual projection encounters two major challenges, the noise from word-alignment error and the syntactic divergences between two languages. To solve these two problems, a semi-supervised learning framework of cross-lingual projection is proposed to get better annotations using parallel data. Moreover, a projection model is introduced to model the projection process of labeling from the resource-rich language to the resource-scarce language. The projection model, together with the traditional target model of cross-lingual projection, can be seen as two views of parallel data. Utilizing these two views, an extension of co-training algorithm to structured predictions is designed to boost the result of the two models. Experiments show that the proposed cross-lingual projection method improves the accuracy in the task of POS-tagging projection. And using only one-to-one alignments proves to lead to more accurate results than using all kinds of alignment information.","PeriodicalId":128421,"journal":{"name":"2011 IEEE/WIC/ACM International Conferences on Web Intelligence and Intelligent Agent Technology","volume":"56 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2011-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":"{\"title\":\"Semi-supervised Learning Framework for Cross-Lingual Projection\",\"authors\":\"PengLong Hu, Mo Yu, Jing Li, Conghui Zhu, T. Zhao\",\"doi\":\"10.1109/WI-IAT.2011.58\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Cross-lingual projection encounters two major challenges, the noise from word-alignment error and the syntactic divergences between two languages. To solve these two problems, a semi-supervised learning framework of cross-lingual projection is proposed to get better annotations using parallel data. Moreover, a projection model is introduced to model the projection process of labeling from the resource-rich language to the resource-scarce language. The projection model, together with the traditional target model of cross-lingual projection, can be seen as two views of parallel data. Utilizing these two views, an extension of co-training algorithm to structured predictions is designed to boost the result of the two models. Experiments show that the proposed cross-lingual projection method improves the accuracy in the task of POS-tagging projection. And using only one-to-one alignments proves to lead to more accurate results than using all kinds of alignment information.\",\"PeriodicalId\":128421,\"journal\":{\"name\":\"2011 IEEE/WIC/ACM International Conferences on Web Intelligence and Intelligent Agent Technology\",\"volume\":\"56 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2011-08-22\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"2\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2011 IEEE/WIC/ACM International Conferences on Web Intelligence and Intelligent Agent Technology\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/WI-IAT.2011.58\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2011 IEEE/WIC/ACM International Conferences on Web Intelligence and Intelligent Agent Technology","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/WI-IAT.2011.58","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 2

摘要

跨语言投影面临两大挑战，一是词对误差带来的噪声，二是两种语言之间的句法差异。为了解决这两个问题，提出了一种跨语言投影的半监督学习框架，利用并行数据获得更好的标注。在此基础上，引入投影模型对标注从资源丰富语言到资源稀缺语言的投影过程进行建模。投影模型与传统的跨语言投影目标模型可以看作是并行数据的两种视图。利用这两种观点，设计了一种将协同训练算法扩展到结构化预测的方法，以提高这两种模型的结果。实验表明，本文提出的跨语言投影方法提高了pos标注投影任务的准确率。事实证明，仅使用一对一比对比对比使用各种比对比对信息得到的结果更准确。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

Semi-supervised Learning Framework for Cross-Lingual Projection

Cross-lingual projection encounters two major challenges, the noise from word-alignment error and the syntactic divergences between two languages. To solve these two problems, a semi-supervised learning framework of cross-lingual projection is proposed to get better annotations using parallel data. Moreover, a projection model is introduced to model the projection process of labeling from the resource-rich language to the resource-scarce language. The projection model, together with the traditional target model of cross-lingual projection, can be seen as two views of parallel data. Utilizing these two views, an extension of co-training algorithm to structured predictions is designed to boost the result of the two models. Experiments show that the proposed cross-lingual projection method improves the accuracy in the task of POS-tagging projection. And using only one-to-one alignments proves to lead to more accurate results than using all kinds of alignment information.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

2011 IEEE/WIC/ACM International Conferences on Web Intelligence and Intelligent Agent Technology

自引率

0.00%

发文量

期刊最新文献

Slovak Blog Clustering Enhanced by Mining the Web Comments Automatic Face Annotation in News Images by Mining the Web Exploiting Additional Dimensions as Virtual Items on Top-N Recommender Systems Supporting Agent Systems in the Programming Language A Software Agent Framework for Exploiting Demand-Side Consumer Social Networks in Power Systems