Title: Micro-blog category based on feature-words category dispersion
Authors: Yingyou Chen, Qing Wu
Venue: 2012 IEEE 2nd International Conference on Cloud Computing and Intelligence Systems
Published: 2012-10-01
DOI: 10.1109/CCIS.2012.6664377
URL: https://doi.org/10.1109/CCIS.2012.6664377
Citation count: 0
Abstract
Classifying micro-blog information is an important preprocessing step in micro-blog data processing. Because of the distinctive properties of micro-blog text, traditional classification methods have limitations when applied to it. A single micro-blog post is short, contains few effective feature words, and is relatively colloquial, so this paper proposes extending the feature words of a text with similar words and collocations, which reduces the possibility of feature loss. For feature selection and weight calculation, it proposes a text classification method based on the category dispersion of feature words and their degree of dispersion. Experiments show that the proposed method performs well at classifying micro-blog text and has better applicability to micro-blog text classification scenarios.
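The abstract does not give the formulas for feature extension or dispersion-based weighting, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' method. The `SIMILAR` lookup table, the toy corpus, and the function names are hypothetical. The weight used here (one minus the normalized entropy of a word's distribution over categories) is a stand-in for the paper's category dispersion measure.

```python
import math
from collections import Counter, defaultdict

# Hypothetical table of similar words / collocations (toy data, an assumption).
SIMILAR = {"match": ["game"], "phone": ["mobile"]}

def expand(tokens, similar=SIMILAR):
    """Extend a post's feature words with similar words and collocations."""
    out = list(tokens)
    for t in tokens:
        for s in similar.get(t, []):
            if s not in out:
                out.append(s)
    return out

def train(docs):
    """docs: list of (tokens, category). Returns (categories, rates, weights)."""
    cat_sizes = Counter(cat for _, cat in docs)
    df = defaultdict(Counter)  # word -> category -> number of posts containing it
    for tokens, cat in docs:
        for w in set(expand(tokens)):
            df[w][cat] += 1
    cats = sorted(cat_sizes)
    # Fraction of posts in each category that contain the word.
    rates = {w: {c: df[w][c] / cat_sizes[c] for c in cats} for w in df}
    weights = {}
    for w, r in rates.items():
        total = sum(r.values())
        probs = [v / total for v in r.values() if v > 0]
        entropy = -sum(p * math.log(p) for p in probs)
        max_entropy = math.log(len(cats)) if len(cats) > 1 else 1.0
        # Assumed weight: 1.0 if the word occurs in a single category,
        # 0.0 if it is spread evenly over all categories.
        weights[w] = 1.0 - entropy / max_entropy
    return cats, rates, weights

def classify(tokens, model):
    """Pick the category with the highest weighted feature-word score."""
    cats, rates, weights = model
    scores = {c: 0.0 for c in cats}
    for w in expand(tokens):
        if w in rates:
            for c in cats:
                scores[c] += weights[w] * rates[w][c]
    return max(scores, key=scores.get)

# Toy corpus (an assumption for illustration only).
docs = [
    (["match", "goal"], "sports"),
    (["game", "team", "today"], "sports"),
    (["phone", "screen"], "tech"),
    (["mobile", "app", "today"], "tech"),
]
model = train(docs)
print(classify(["match", "today"], model))
```

In this sketch a word such as "today", which appears equally often in both categories, receives a weight of zero and does not influence the decision, while a word confined to one category receives the maximum weight. Extension lets the short post ["match", "today"] also match posts that only mention "game".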