Micro-blog hot topics detection method based on user role orientation
Authors: Wu Yang, Yanghao Li, Ling Lu
Journal: 计算机应用, vol. 33, pp. 3076-3079
Published: 2013-11-27
DOI: 10.3724/SP.J.1087.2013.03076 (https://doi.org/10.3724/SP.J.1087.2013.03076)
Citations: 2
Abstract
To address the low efficiency of extracting hot topics from huge amounts of micro-blog data, a new topic detection method based on user role orientation is proposed. First, noise data from some users is filtered out according to user role orientation. Second, feature weights are calculated with the Term Frequency-Inverse Document Frequency (TF-IDF) function combined with semantic similarity, to reduce errors caused by semantic expression. Then, an improved Single-Pass clustering algorithm is used to extract micro-blog topics. Finally, the heat of each topic is evaluated from its numbers of reposts and comments, and the hot topics are identified. The results show that the average missing rate and false detection rate decrease by 12.09% and 2.37%, respectively, indicating that topic detection accuracy is effectively improved and that the method is feasible.
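The clustering and heat-evaluation steps can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the improved Single-Pass variant, the semantic-similarity term, or the heat formula, so the code below assumes plain TF-IDF with cosine similarity, the basic Single-Pass procedure, and a weighted sum of reposts and comments. The function names, the `threshold` value, and the weights `w_repost` and `w_comment` are hypothetical, and the user-role filtering step is omitted.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Build one sparse TF-IDF vector (term -> weight) per tokenized post."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        # Smoothed IDF (an assumption) so that no term receives a zero weight.
        vectors.append({
            t: (c / len(doc)) * (math.log((1 + n) / (1 + df[t])) + 1.0)
            for t, c in tf.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm = math.sqrt(sum(w * w for w in a.values())) * math.sqrt(
        sum(w * w for w in b.values())
    )
    return dot / norm if norm else 0.0


def single_pass(vectors, threshold=0.3):
    """Basic Single-Pass clustering: each post joins its most similar
    existing cluster if the similarity reaches the threshold, otherwise
    it starts a new cluster."""
    clusters = []  # each cluster: {"centroid": dict, "members": [post indices]}
    for i, vec in enumerate(vectors):
        best, best_sim = None, 0.0
        for cluster in clusters:
            sim = cosine(vec, cluster["centroid"])
            if sim > best_sim:
                best, best_sim = cluster, sim
        if best is not None and best_sim >= threshold:
            best["members"].append(i)
            # Keep the centroid as a running sum; cosine ignores the scale.
            for t, w in vec.items():
                best["centroid"][t] = best["centroid"].get(t, 0.0) + w
        else:
            clusters.append({"centroid": dict(vec), "members": [i]})
    return clusters


def topic_heat(posts, members, w_repost=1.0, w_comment=1.0):
    """Hypothetical heat score: weighted sum of reposts and comments."""
    return sum(
        w_repost * posts[i]["reposts"] + w_comment * posts[i]["comments"]
        for i in members
    )


# Toy data, invented for illustration only.
posts = [
    {"tokens": ["earthquake", "sichuan", "rescue"], "reposts": 120, "comments": 40},
    {"tokens": ["earthquake", "rescue", "team"], "reposts": 80, "comments": 30},
    {"tokens": ["phone", "launch", "price"], "reposts": 10, "comments": 5},
    {"tokens": ["phone", "price", "review"], "reposts": 5, "comments": 2},
]
clusters = single_pass(tfidf_vectors([p["tokens"] for p in posts]))
ranked = sorted(clusters, key=lambda c: topic_heat(posts, c["members"]), reverse=True)
for cluster in ranked:
    print(cluster["members"], topic_heat(posts, cluster["members"]))
```

Because Single-Pass assigns each post as it arrives, the result depends on input order and on the threshold; a real system would tune both on labeled data.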