Word Sense Disambiguation Using Multiple Contextual Features
Liang-Chih Yu, Chung-Hsien Wu, Jui-Feng Yeh
Int. J. Comput. Linguistics Chin. Lang. Process., published 2010-09-01
DOI: 10.30019/IJCLCLP.201009.0002
Citations: 1
Abstract
Word sense disambiguation (WSD) is a technique for identifying the correct sense of polysemous words, and it is useful for many applications, such as machine translation (MT), lexical substitution, information retrieval (IR), and biomedical text processing. In this paper, we propose the use of multiple contextual features, including predicate-argument structure and named entities, to train two commonly used classifiers, Naive Bayes (NB) and Maximum Entropy (ME), for word sense disambiguation. Experiments evaluate the classifiers' performance on the OntoNotes corpus against classifiers trained with a set of baseline features, such as bag-of-words, n-grams, and part-of-speech (POS) tags. Experimental results show that incorporating both predicate-argument structure and named entities yields higher classification accuracy for both classifiers than the baseline features do, reaching accuracies as high as 81.6% for NB and 87.4% for ME.
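To illustrate the classification setup the abstract describes, the following is a minimal sketch of a Naive Bayes WSD classifier over bag-of-words context features with add-one smoothing. The training examples, the target word "bank", and its sense labels are all hypothetical; the paper's actual system additionally uses n-grams, POS tags, predicate-argument structure, and named entities extracted from OntoNotes, none of which are modeled here.

```python
from collections import Counter, defaultdict
import math

def train_nb(examples):
    """Train a multinomial Naive Bayes model.

    examples: list of (context_words, sense) pairs, where context_words
    is the bag of words surrounding one occurrence of the target word.
    """
    sense_counts = Counter()            # P(sense) numerator
    feat_counts = defaultdict(Counter)  # P(word | sense) numerators
    vocab = set()
    for words, sense in examples:
        sense_counts[sense] += 1
        for w in words:
            feat_counts[sense][w] += 1
            vocab.add(w)
    return sense_counts, feat_counts, vocab

def predict_nb(model, context):
    """Return the sense maximizing log P(sense) + sum log P(word | sense)."""
    sense_counts, feat_counts, vocab = model
    total = sum(sense_counts.values())
    best, best_lp = None, float("-inf")
    for sense, count in sense_counts.items():
        lp = math.log(count / total)  # log prior
        denom = sum(feat_counts[sense].values()) + len(vocab)
        for w in context:
            # add-one (Laplace) smoothing for unseen context words
            lp += math.log((feat_counts[sense][w] + 1) / denom)
        if lp > best_lp:
            best, best_lp = sense, lp
    return best

# Hypothetical labeled contexts for the polysemous word "bank"
train = [
    (["river", "water", "fish"], "bank/shore"),
    (["money", "deposit", "loan"], "bank/finance"),
    (["loan", "interest", "account"], "bank/finance"),
    (["shore", "river", "mud"], "bank/shore"),
]
model = train_nb(train)
print(predict_nb(model, ["deposit", "interest"]))  # → bank/finance
```

Richer feature sets such as those in the paper would simply add more items to each context bag (e.g. POS-tagged tokens or semantic-role labels as distinct feature strings); the Naive Bayes machinery is unchanged.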