{"title":"Using Hyperlink Texts to Improve Quality of Identifying Document Topics Based on Wikipedia","authors":"Dat T. Huynh, T. Cao, P. H. Pham, Toan N. Hoang","doi":"10.1109/KSE.2009.20","DOIUrl":null,"url":null,"abstract":"This paper presents a method to identify the topics of documents based on Wikipedia category network. It is to improve the method previously proposed by Schonhofen by taking into account the weights of words in hyperlink texts in Wikipedia articles. The experiments on Computing and Team Sport domains have been carried out and showed that our proposed method outperforms the Schonhofen’s one.","PeriodicalId":347175,"journal":{"name":"2009 International Conference on Knowledge and Systems Engineering","volume":"41 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2009-10-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"10","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2009 International Conference on Knowledge and Systems Engineering","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/KSE.2009.20","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 10
Abstract
This paper presents a method to identify the topics of documents based on Wikipedia category network. It is to improve the method previously proposed by Schonhofen by taking into account the weights of words in hyperlink texts in Wikipedia articles. The experiments on Computing and Team Sport domains have been carried out and showed that our proposed method outperforms the Schonhofen’s one.