A Topic Detection and Tracking Method Combining NLP with Suffix Tree Clustering
Yaohong Jin
2012 International Conference on Computer Science and Electronics Engineering, published 2012-03-23
DOI: 10.1109/ICCSEE.2012.131 (https://doi.org/10.1109/ICCSEE.2012.131)
Citations: 6
Abstract
This paper presents a topic detection and tracking method that combines semantic analysis with the Suffix Tree Clustering (STC) algorithm. An NLP-based feature selection step selects nouns, verbs, and named entities as the input to STC. To address topic drift, the vector space model (VSM) of each cluster is built from keywords extracted from the suffix tree nodes using a mutual information algorithm. After cluster similarity computation and topic detection and tracking, a semantic analysis step filters out words with the same meaning and analyzes the semantic structure of the words in each cluster label. Finally, a content-relevant description is generated for each topic. Experiments showed that the method can detect and track topics in news articles effectively.
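
The abstract does not give the formulas behind the cluster VSM or the similarity computation, so the following is only a minimal sketch of one plausible reading, not the paper's implementation. It assumes each document is already reduced to a list of nouns, verbs, and named entities, scores words by positive pointwise mutual information with a cluster (one common form of mutual information, chosen here as an assumption), and compares clusters by cosine similarity. The names `pmi_keywords` and `cosine`, the `top_k` parameter, and the toy documents are all hypothetical.

```python
import math
from collections import Counter


def pmi_keywords(cluster_docs, all_docs, top_k=5):
    """Build a sparse keyword vector for a cluster.

    Assumption: PMI(w, c) = log(P(w, c) / (P(w) * P(c))), estimated from
    document counts; only positive scores are kept. cluster_docs must be
    a subset of all_docs.
    """
    n_docs = len(all_docs)
    p_cluster = len(cluster_docs) / n_docs
    df_all = Counter(w for doc in all_docs for w in set(doc))
    df_cluster = Counter(w for doc in cluster_docs for w in set(doc))

    scores = {}
    for word, n_joint in df_cluster.items():
        p_joint = n_joint / n_docs
        p_word = df_all[word] / n_docs
        pmi = math.log(p_joint / (p_word * p_cluster))
        if pmi > 0:  # drop words that are not specific to the cluster
            scores[word] = pmi

    # Highest score first; ties broken alphabetically for determinism.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return dict(ranked[:top_k])


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v[word] for word, weight in u.items() if word in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Toy news documents, already filtered to content words (hypothetical data).
docs = [
    ["earthquake", "rescue", "city"],
    ["earthquake", "rescue", "aid"],
    ["election", "vote", "city"],
    ["election", "vote", "candidate"],
]

vec_a = pmi_keywords(docs[:2], docs)   # earthquake stories
vec_b = pmi_keywords(docs[2:], docs)   # election stories
vec_c = pmi_keywords([docs[1]], docs)  # a single earthquake story

print(cosine(vec_a, vec_c), cosine(vec_a, vec_b))
```

In this sketch a new cluster would be attached to the existing topic with the highest cosine similarity, and treated as a new topic when every similarity falls below some threshold; the threshold value is not stated in the abstract and would have to be tuned.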