Word sense disambiguation by semantic inference
Xinda Wang, Xuri Tang, Weiguang Qu, Min Gu
2017 International Conference on Behavioral, Economic, Socio-cultural Computing (BESC), 2017-10-01
DOI: 10.1109/BESC.2017.8256391 (https://doi.org/10.1109/BESC.2017.8256391)
This paper proposes an algorithm for unsupervised Word Sense Disambiguation that bypasses the knowledge bottleneck faced by supervised approaches. Simulating the semantic inference process performed by human language users, the algorithm uses a thesaurus to obtain potential substitute words for the target word in a sentence, builds substitute constructs by replacing the target word with each substitute, uses large-scale dependency-parsed corpora to calculate the likelihood of the substitute constructs, and then selects the best substitute word, which helps specify the sense of the target word in the sentence. Experiments with WordNet 2.1 and the English Gigaword corpus on the lexical sample task in SemEval-2007 show that the algorithm achieves state-of-the-art accuracy for both nouns and verbs, 3–5 percent higher than the best unsupervised system in SemEval-2007, provided that the knowledge source supplies sufficient information.
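The substitution pipeline described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration rather than the authors' implementation: the toy `THESAURUS` stands in for WordNet, the invented `DEP_COUNTS` stand in for statistics from a dependency-parsed corpus, the relation labels are hypothetical, and the add-one smoothed log-count score is a simple stand-in for the likelihood computation, which the abstract does not specify.

```python
import math
from collections import Counter

# Hypothetical toy thesaurus: target word -> sense label -> substitute words.
# The paper draws substitutes from WordNet; these entries are invented.
THESAURUS = {
    "bank": {
        "bank%financial": ["institution", "lender"],
        "bank%river": ["shore", "riverside"],
    }
}

# Hypothetical counts of (head, relation, dependent) triples, standing in
# for statistics gathered from a large dependency-parsed corpus.
DEP_COUNTS = Counter({
    ("deposit", "prep_in", "institution"): 40,
    ("deposit", "prep_in", "lender"): 5,
    ("deposit", "prep_in", "shore"): 1,
    ("sit", "prep_on", "shore"): 30,
    ("sit", "prep_on", "riverside"): 12,
})


def construct_score(substitute, context, counts=DEP_COUNTS):
    """Score a substitute by filling the "*" slot of each context triple.

    Uses add-one smoothed log counts, an assumed stand-in for the
    likelihood computation mentioned in the abstract.
    """
    total = 0.0
    for triple in context:
        filled = tuple(substitute if part == "*" else part for part in triple)
        total += math.log(counts[filled] + 1)
    return total


def disambiguate(target, context, thesaurus=THESAURUS):
    """Return (sense, substitute) for the highest-scoring substitute."""
    best = (None, None)
    best_score = float("-inf")
    for sense, substitutes in thesaurus[target].items():
        for substitute in substitutes:
            score = construct_score(substitute, context)
            if score > best_score:
                best, best_score = (sense, substitute), score
    return best


# "He deposited money in the bank": the target fills the dependent slot.
print(disambiguate("bank", [("deposit", "prep_in", "*")]))
# -> ('bank%financial', 'institution')
```

In this sketch the sense is read off from whichever substitute fits the dependency context best, which mirrors the idea that the best substitute word helps specify the sense of the target word. A realistic version would need real thesaurus entries, real corpus counts, and a handling strategy for targets whose substitutes never occur in the corpus.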