Pub Date : 2023-01-01 DOI: 10.12677/hjdm.2023.132016
志浩 吴
The research is on the primary potential high-quality user division of short video official unau-
Title : Short Video Account Shallow Security Level Identification Method—Preliminary Division of Potential High-Quality Accounts for Short Videos
Journal : 数据挖掘 (Data Mining)
Pub Date : 2023-01-01 DOI: 10.12677/hjdm.2023.133022
梅 袁
In the era of big data, data
Title : Fast Attribute Reduction Algorithm Based on Maximum Decision Entropy
Journal : 数据挖掘 (Data Mining)
Pub Date : 2023-01-01 DOI: 10.12677/hjdm.2023.132017
晓晴 洪
With the continuing development of communication network engineering and new infrastructure technologies, China is gradually moving from a 4G society to a 5G society. With its low latency, large bandwidth and wide connectivity, 5G has become an important technical foundation for building smart cities and digital villages. The large-scale 5G connectivity that smart cities require depends on a higher rate of 5G use among subscribers. To address this problem, the paper obtains data from a mobile big data platform and builds a classification model that predicts potential 5G users, so that these users can be identified correctly and offered accurate service recommendations, raising the 5G utilization rate in China and speeding up the construction of new smart cities. Building the prediction model involves data pre-processing, feature engineering, model training and model evaluation. First, exploratory analysis and pre-processing were carried out, including data cleaning, removal of unique-value attributes and data transformation. The features in the dataset were then screened using the chi-square test, the t-test and the Pearson correlation coefficient, which selected 24 feature variables of high importance. Random Forest, CatBoost and LightGBM models were constructed on the screened features, and their parameters were tuned to find the optimal settings.
The models were then built with the optimal parameters, tested on the test set, and evaluated by accuracy, recall and AUC. The comparison shows that the LightGBM model is generally better than the other models at predicting potential 5G users. In addition, feature importance scores were obtained from the model and ranked. With more accurate identification and mining of potential 5G users, operators can target their marketing at different customers, encourage more users to move from 4G to 5G, and accelerate the sustainable development of China’s 5G market and the construction of smart cities.
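The screening and evaluation steps described in this abstract can be illustrated with a minimal, standard-library-only Python sketch. It is not the paper's code: it shows only the Pearson screening step (the chi-square and t-test steps are omitted), the 0.3 threshold is an assumed cut-off, and the feature names, labels and model scores are hypothetical toy values chosen for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / math.sqrt(var_x * var_y)


def screen_features(columns, labels, threshold=0.3):
    """Keep feature names whose |r| with the label reaches the (assumed) threshold."""
    return [name for name, values in columns.items()
            if abs(pearson(values, labels)) >= threshold]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred):
    """Fraction of true positives (label 1) that were predicted as 1."""
    positives = sum(1 for t in y_true if t == 1)
    hits = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    return hits / positives if positives else 0.0


def auc(y_true, scores):
    """AUC as the share of positive/negative pairs ranked correctly (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = user later adopted 5G, 0 = did not.
labels = [1, 1, 1, 0, 0, 0]
columns = {
    "monthly_data_gb": [30, 25, 28, 5, 8, 6],  # tracks the label closely
    "noise_flag": [0, 1, 0, 0, 1, 0],          # unrelated to the label
}
selected = screen_features(columns, labels)

# Hypothetical predicted probabilities from two candidate models.
scores_a = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1]
scores_b = [0.9, 0.7, 0.6, 0.4, 0.2, 0.1]
for name, scores in (("model_a", scores_a), ("model_b", scores_b)):
    preds = [1 if s >= 0.5 else 0 for s in scores]  # 0.5 cut-off is an assumption
    print(name, accuracy(labels, preds), recall(labels, preds), auc(labels, scores))
```

Comparing several models on the same held-out labels with the same three metrics, as in the final loop, mirrors the kind of comparison the abstract reports; in practice the scores would come from trained Random Forest, CatBoost or LightGBM models rather than hand-written lists.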
Title : Analysis and Mining of 5G Potential Customers Based on Machine Learning
Journal : 数据挖掘 (Data Mining)
Pub Date : 2023-01-01 DOI: 10.12677/hjdm.2023.133021
晓敏 梁
With the rapid development of information technology, a large amount of data has been generated
Title : Researches on Feature Selection Algorithm for Generalized Multi-Granularity Rough Sets
Journal : 数据挖掘 (Data Mining)
Pub Date : 2023-01-01 DOI: 10.12677/hjdm.2023.131004
海鹏 刘
Title : Analysis on Domestic Literature Related to Novel Coronavirus
Journal : 数据挖掘 (Data Mining)
Pub Date : 2023-01-01 DOI: 10.12677/hjdm.2023.133026
莉婷 韩
The new insurance contract standard came into effect on January 1 this year, marking the further improvement of China’s accounting standards system for enterprises and maintaining continuous convergence with IFRS. Therefore, this paper uses the Python programming language to conduct
Title : Analysis of Old and New Insurance Contract Guidelines Based on Text Mining
Journal : 数据挖掘 (Data Mining)
Pub Date : 2023-01-01 DOI: 10.12677/hjdm.2023.131009
泽凯 陈
Title : The Impact of Rural Population Aging on Agricultural Development in Guangdong Province
Journal : 数据挖掘 (Data Mining)
Pub Date : 2023-01-01 DOI: 10.12677/hjdm.2023.132010
晓筝 李
Title : Prediction Research on the Scale of Middle School Students in Jilin Province Based on BP Neural Network
Journal : 数据挖掘 (Data Mining)
Pub Date : 2023-01-01 DOI: 10.12677/hjdm.2023.133025
飞燕 马
With the development of the economy, online shopping has become widely popular. Because it is convenient and fast, saves time and effort, and offers door-to-door delivery, it is increasingly favored and has become an indispensable part of daily life. As people’s economic means and consumption levels improve, expectations for the online shopping experience are also rising. At the same time, competition among major online retailers has become increasingly fierce. To attract consumers’ attention and increase product sales, some businesses have started to use “speculation” and “order” methods such as selling, positive reviews, and negative reviews to maliciously promote products, infringing on consumers’ rights and interests. To protect consumers’ right to know and to choose, this project uses a dataset provided by Inspur Zhuosu Company to analyze the causes of abnormal products through a combination of quantitative and qualitative data mining analysis. Mathematical modeling and machine learning methods are used to define a set of abnormal-product indicators, and these indicators are used to construct a model for finding and predicting abnormal products. The experimental results indicate that the model performs well and has some practical value.
Title : Research and Implementation of Abnormal Product Identification Model Based on Stability
Journal : 数据挖掘 (Data Mining)
Pub Date : 2023-01-01 DOI: 10.12677/hjdm.2023.131008
庆权 孟
Title : Study on Pedestrian Multi-Object Tracking Based on Iterative Strategy
Journal : 数据挖掘 (Data Mining)