Weakly supervised neural networks for Part-Of-Speech tagging
S. Chopra, S. Bangalore
2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1965-1968, published 2012-03-25. DOI: 10.1109/ICASSP.2012.6288291
We introduce a simple and novel method for the weakly supervised problem of Part-Of-Speech tagging with a dictionary. Our method trains a connectionist network that learns a distributed latent representation of the words while maximizing tagging accuracy. To compensate for the unavailability of true labels, we train the model using a curriculum: instead of a random order, the model is presented with an ordered sequence of training samples, proceeding from "easier" to "harder" ones. On a standard test corpus, we show that without using any grammatical information, our model outperforms the standard EM algorithm in tagging accuracy, and its performance is comparable to other state-of-the-art models. We also show that curriculum learning in this setting significantly improves performance, both in speed of convergence and in generalization.
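The easy-to-hard ordering described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: the difficulty criterion used here (average number of candidate tags per word in the dictionary) is an assumed, illustrative proxy for "easier" versus "harder" samples.

```python
# Sketch of curriculum ordering for dictionary-based weakly supervised
# POS tagging. Difficulty = average tag ambiguity per word (assumed
# criterion for illustration; the paper's exact ordering may differ).

def sentence_difficulty(sentence, tag_dictionary):
    """Score a sentence by the average number of candidate tags per word.

    Words missing from the dictionary are treated as having one tag,
    so they do not distort the average (an illustrative choice).
    """
    counts = [len(tag_dictionary.get(word, [])) or 1 for word in sentence]
    return sum(counts) / len(counts)

def curriculum_order(corpus, tag_dictionary):
    """Return training sentences sorted from 'easier' to 'harder'."""
    return sorted(corpus, key=lambda s: sentence_difficulty(s, tag_dictionary))

# Toy dictionary mapping words to their possible POS tags.
tag_dict = {
    "the": ["DT"],
    "dog": ["NN"],
    "runs": ["NNS", "VBZ"],
    "fast": ["RB", "JJ", "NN", "VB"],
}
corpus = [
    ["the", "dog", "runs"],
    ["the", "dog"],
    ["runs", "fast"],
]
ordered = curriculum_order(corpus, tag_dict)
# Least ambiguous sentence comes first; training would then iterate
# over `ordered` instead of a randomly shuffled corpus.
```

In an actual training loop, the model would consume mini-batches drawn from `ordered` in sequence, so that unambiguous sentences shape the latent word representations before highly ambiguous ones are introduced.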