Machine Learning-based Weibo user group profiling under hot events
Authors: Bingyun Lin, Xin Zhu, Jianming Hu
Published in: 2022 International Conference on Culture-Oriented Science and Technology (CoST), 2022-08-01
DOI: 10.1109/cost57098.2022.00065
Constructing profiles of the Weibo users involved in a hot event helps characterize those users, which is conducive to the relevant departments strengthening public opinion guidance and publicity and education. Taking the “Viya's tax evasion” case as an example, the Latent Dirichlet Allocation (LDA) topic model is first used to model the topics of Weibo content related to the case, with the optimal number of topics determined by perplexity. Then, the k-prototype algorithm and an improved Density-Based Spatial Clustering of Applications with Noise (DBSCAN) algorithm are each used to cluster Weibo users, and the similarities and differences between the resulting user categories are analyzed. Finally, the clustering results of the two algorithms are compared. The experiments show that topic generation based on the LDA topic model describes the discussion topics well. When processing data with mixed attributes, the k-prototype algorithm and the improved DBSCAN algorithm each have their own advantages, and combining the results of the two algorithms yields a more complete user group portrait.
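To make the mixed-attribute clustering step concrete, the following is a minimal sketch of k-prototype-style clustering in plain Python. It assumes the commonly used cost of squared Euclidean distance on numeric attributes plus a weight `gamma` times the number of categorical mismatches. The abstract does not state the features, the weight, or the initialization used in the paper, so the toy features (scaled activity values, verification status, dominant topic), `gamma`, and `init_idx` below are all hypothetical, and the improved DBSCAN variant is not covered.

```python
from collections import Counter


def mixed_dissimilarity(x_num, x_cat, proto_num, proto_cat, gamma=1.0):
    """Squared Euclidean distance on numeric parts plus gamma * categorical mismatches."""
    numeric = sum((a - b) ** 2 for a, b in zip(x_num, proto_num))
    categorical = sum(a != b for a, b in zip(x_cat, proto_cat))
    return numeric + gamma * categorical


def k_prototypes(nums, cats, init_idx, gamma=1.0, max_iter=20):
    """Cluster mixed data; prototypes are (numeric mean, categorical mode) pairs."""
    # Assumption: initial prototypes are copies of the points listed in init_idx.
    protos = [(list(nums[i]), list(cats[i])) for i in init_idx]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest prototype.
        new_labels = [
            min(
                range(len(protos)),
                key=lambda k: mixed_dissimilarity(n, c, protos[k][0], protos[k][1], gamma),
            )
            for n, c in zip(nums, cats)
        ]
        if new_labels == labels:
            break  # assignments are stable, so the prototypes will not change
        labels = new_labels
        # Update step: mean for numeric attributes, mode for categorical ones.
        for k in range(len(protos)):
            members = [i for i, lab in enumerate(labels) if lab == k]
            if not members:
                continue  # keep the previous prototype for an empty cluster
            mean = [
                sum(nums[i][j] for i in members) / len(members)
                for j in range(len(nums[0]))
            ]
            mode = [
                Counter(cats[i][j] for i in members).most_common(1)[0][0]
                for j in range(len(cats[0]))
            ]
            protos[k] = (mean, mode)
    return labels, protos


# Hypothetical toy users: (scaled follower count, scaled post count)
# and (verification status, dominant LDA topic).
nums = [(0.1, 0.2), (0.2, 0.1), (0.9, 0.8), (0.8, 0.9)]
cats = [
    ("unverified", "topic_0"),
    ("unverified", "topic_0"),
    ("verified", "topic_1"),
    ("verified", "topic_1"),
]
labels, protos = k_prototypes(nums, cats, init_idx=[0, 2])
print(labels)
```

In this formulation `gamma` controls how much a categorical mismatch counts relative to numeric distance, so numeric attributes are usually rescaled before clustering to keep the two parts comparable.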