Using time series analysis, this paper first identifies the trend in the birth rate and makes short-term forecasts of its future variation, with the aim of informing possible responses. It then examines, through regression analysis, the influence that several dominant factors exert on the birth rate. Because the correlations between the dependent variable and each independent variable are not significant, cluster analysis is used to select representative provinces as samples for the subsequent research. Using SPSS as the analytical tool, we construct a regression model of how price level, in combination with education level, influences the birth rate.
{"title":"Change trend of birth rate of our country population and influence factor analysis","authors":"Ran Yu, Dan Li, Hongyun Gao, Shuang Chen","doi":"10.62051/d5kctv83","DOIUrl":"https://doi.org/10.62051/d5kctv83","url":null,"abstract":"On the ground of time series analyses, this paper excavates the trend in preliminary and carries out some short-term forecasts about future birthrate variation in wish to shed light on its solutions. This paper attempts to excavate the influence that several dominant factors exert on birthrates via regression analyses. The cluster analyses are utilized to filter out representative provinces as samples for subsequent research in order to tackle the problem that the correlations between dependant variable and each independent variable is not significant. Wielding SPSS as an analytical tool, we construct the regression model about how price level in combination with education level influences the birthrates.","PeriodicalId":509968,"journal":{"name":"Transactions on Computer Science and Intelligent Systems Research","volume":"765 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-04-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140719124","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
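The regression step described in the abstract above (birth rate explained by price level together with education level) can be illustrated with a small ordinary least squares fit. This is only a sketch under stated assumptions: the paper uses SPSS and does not give its data or coefficients, so the function names, the two predictor columns, and all numbers below are hypothetical, and the target values are generated from a known linear rule purely so the fit can be checked.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix, so row operations update both sides together.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_ols(rows, y):
    """Ordinary least squares; returns [intercept, b1, b2, ...]."""
    design = [[1.0] + list(r) for r in rows]
    k = len(design[0])
    # Normal equations: (X^T X) beta = X^T y
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * t for d, t in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical illustration data (NOT from the paper): each row is
# (price index, mean years of schooling) for one representative province.
predictors = [(100.0, 8.0), (102.0, 8.5), (105.0, 9.0),
              (103.0, 10.0), (108.0, 9.5), (110.0, 11.0)]
# Synthetic birth rates generated from a known linear rule.
birth_rate = [30.0 - 0.1 * p - 0.8 * e for p, e in predictors]

intercept, b_price, b_edu = fit_ols(predictors, birth_rate)
```

Because the synthetic targets are exactly linear in the two predictors, the fit recovers the generating coefficients up to rounding. With real provincial data the residuals and significance tests reported by a statistics package would also need to be examined.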
With the rapid growth of artificial intelligence (AI) technology, traditional obstacle detection equipment faces multiple challenges: high cost, poor real-time performance, lack of standardization, dependence on manual operation, and time-consuming, labor-intensive workflows. To address these shortcomings, this article proposes a deep learning (DL) based obstacle detection technology for autonomous driving on roads. An autonomous vehicle is a complex system that integrates key components such as environmental perception, positioning and navigation, path planning, and motion control, and accurate perception of the surrounding environment is one of its core technologies. In practice, autonomous vehicles often face complex and variable road environments, which can degrade the quality of camera images and leave them blurred or unclear. DL methods, especially object detection algorithms, have shown distinct advantages in visual perception and recognition in autonomous driving scenes. This paper studies DL-based road obstacle detection for autonomous driving in depth, aiming to achieve efficient and accurate obstacle recognition, improve the safety and reliability of autonomous driving systems, and promote the further development of autonomous driving technology.
{"title":"Obstacle Detection Technology for Autonomous Driving Based on Deep Learning","authors":"Chenhao Gao","doi":"10.62051/c3evm786","DOIUrl":"https://doi.org/10.62051/c3evm786","url":null,"abstract":"With the rapid growth of artificial intelligence (AI) technology, traditional obstacle detection equipment faces multiple challenges such as high cost, low real-time performance, non normalization, dependence on manual operation, and time-consuming and labor-intensive. To address these shortcomings, this article proposes a deep learning (DL) based obstacle detection technology for autonomous driving on the road surface. As a complex system that integrates multiple key components such as environmental perception, positioning and navigation, path planning, and motion control, one of the core technologies of autonomous vehicles is accurate perception of the surrounding environment. In practical applications, autonomous vehicles often face complex and variable road environments, which may lead to a decrease in the quality of images captured by cameras, resulting in blurry and unclear phenomena. The DL method, especially the object detection algorithm, has shown unique advantages in visual perception and recognition in autonomous driving scenes. 
This paper deeply studies the obstacle detection technology of automatic driving road based on DL, aiming to achieve efficient and accurate obstacle recognition, improve the safety and reliability of auto drive system, and promote the further growth of automatic driving technology.","PeriodicalId":509968,"journal":{"name":"Transactions on Computer Science and Intelligent Systems Research","volume":"1 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-04-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140716950","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
With the rapid progress of machine learning (ML) technology, more and more ML algorithms have emerged, and model complexity keeps increasing. This trend brings two significant practical challenges: how to choose an appropriate algorithm or model, and how to optimize its hyperparameters. Automated Machine Learning (AutoML) has emerged in this context. Because different models suit different data types and problem scenarios, it is crucial to select the most suitable model automatically, based on the characteristics of the specific task. AutoML integrates multiple ML algorithms and filters them automatically according to the statistical characteristics of the data and the task requirements, aiming to provide users with the best model choice. Hyperparameters are parameters that must be set before an ML model is trained, such as the learning rate, the number of iterations, and the regularization strength, and they have a significant impact on model performance. AutoML also integrates advanced hyperparameter optimization techniques to search automatically for the best parameter combination, thereby improving the model's generalization ability and prediction accuracy. This article studies the automatic selection and parameter optimization of mathematical models based on ML.
{"title":"Automatic Selection and Parameter Optimization of Mathematical Models Based on Machine Learning","authors":"Shuangbo Zhang","doi":"10.62051/nx5n1v79","DOIUrl":"https://doi.org/10.62051/nx5n1v79","url":null,"abstract":"With the rapid progress of machine learning (ML) technology, more and more ML algorithms have emerged, and the complexity of models is also constantly increasing. This development trend brings two significant challenges in practice: how to choose appropriate algorithm models and how to optimize hyperparameters for these models. In this context, the concept of Automatic Machine Learning (AutoML) has emerged. Due to the applicability of different algorithm models to different data types and problem scenarios, it is crucial to automatically select the most suitable model based on the characteristics of specific tasks. AutoML integrates multiple ML algorithms and automatically filters based on the statistical characteristics of data and task requirements, aiming to provide users with the best model selection solution. Hyperparameters are parameters that ML models need to set before training, such as learning rate, number of iterations, regularization strength, etc., which have a significant impact on the performance of the model. AutoML integrates advanced hyperparameter optimization techniques to automatically find the optimal parameter combination, thereby improving the model's generalization ability and prediction accuracy. 
This article studies the automatic selection and parameter optimization of mathematical models based on ML.","PeriodicalId":509968,"journal":{"name":"Transactions on Computer Science and Intelligent Systems Research","volume":"5 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-04-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140717329","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
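The two tasks named in the abstract above, choosing a model family and tuning its hyperparameters, can be sketched as one exhaustive search scored on a held-out validation set. This is a minimal illustration, not the paper's method: the two candidate families (k-nearest-neighbour regression and one-dimensional ridge regression), the search grid, and the synthetic data are all assumptions made for the example, and the paper's "advanced" optimization techniques are replaced here by a plain grid search.

```python
import random


def knn_predict(train, x, k):
    """Predict y at x as the mean target of the k nearest training points."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    return sum(y for _, y in nearest) / len(nearest)


def ridge_fit(train, lam):
    """1-D ridge regression on centred data; returns (slope, intercept)."""
    n = len(train)
    mx = sum(x for x, _ in train) / n
    my = sum(y for _, y in train) / n
    sxy = sum((x - mx) * (y - my) for x, y in train)
    sxx = sum((x - mx) ** 2 for x, _ in train)
    slope = sxy / (sxx + lam)
    return slope, my - slope * mx


def mse(predict, data):
    """Mean squared error of a prediction function on (x, y) pairs."""
    return sum((predict(x) - y) ** 2 for x, y in data) / len(data)


def select_model(train, valid, search_space):
    """Try every (family, hyperparameter) pair; keep the lowest validation MSE."""
    best = None
    for family, values in search_space.items():
        for value in values:
            if family == "knn":
                predict = lambda x, k=value: knn_predict(train, x, k)
            else:  # "ridge"
                slope, intercept = ridge_fit(train, value)
                predict = lambda x, s=slope, b=intercept: s * x + b
            score = mse(predict, valid)
            if best is None or score < best[0]:
                best = (score, family, value)
    return best


# Synthetic, hypothetical data: a noisy straight line.
rng = random.Random(0)
points = [(i / 10.0, 2.0 * (i / 10.0) + 1.0 + rng.gauss(0.0, 0.05)) for i in range(60)]
rng.shuffle(points)
train, valid = points[:40], points[40:]

# Hypothetical search space: family name -> hyperparameter values to try.
search_space = {"knn": [1, 3, 5], "ridge": [0.0, 1.0, 10.0]}
best_score, best_family, best_value = select_model(train, valid, search_space)
```

A grid search is the simplest baseline. Practical AutoML systems usually replace it with random search, Bayesian optimization, or evolutionary methods, and use cross-validation rather than a single split.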
Based on the supermarket's commodity information in the annex, the detailed historical sales records, the wholesale prices of vegetable commodities and their recent loss rates, and through data analysis of each category and each individual product, an automatic pricing and replenishment decision model is established. An optimization evaluation algorithm is used to formulate the total daily replenishment quantity and pricing strategy for each category and each individual product. For the first problem, the original data in Annexes 2 and 3 are first cleaned of outliers, normalized, and subjected to feature selection and dimensionality reduction. Second, a quarter is taken as one supermarket sales cycle, the share of each category's sales volume in the total sales volume for the same quarter across three years is computed, and the resulting distribution of sales volume across categories is presented. Then, taking one day as the sales cycle, the distribution of daily sales volume over different periods is calculated and presented. Finally, the Pearson grade correlation coefficient is used to judge the relationships between the processed indicators, and a correlation matrix heat map is obtained. From these two results, it is concluded that there is a significant positive correlation between the sales volumes of mosaic and cauliflower vegetables, and a significant negative correlation between the sales volumes of nightshade and aquatic root vegetables. 
For the second problem, considering the functional relationship between total sales volume and cost pricing, correlation analysis and linear fitting are carried out to obtain a linear relationship between the sales price of each category and the maximum sales volume of that category in July of the previous year. Through further nonlinear fitting and solution of the optimization problem, the total daily replenishment volume and pricing strategy for each vegetable category in the coming week (July 1-7, 2023) are obtained, as shown in Table 1 and Table 2, which maximize the supermarket's revenue. For the third problem, the data requirements are analyzed on the basis of the known data: the sales volume of each vegetable during the period, the purchase cost of each vegetable, the past pricing strategies and the responses to them, and the inventory of each vegetable on June 30. On this basis, a multi-objective dynamic programming model is established; with the total number of saleable items at 30, a greedy algorithm is used to obtain the replenishment quantity of each individual item on July 1, and the pricing strategy is then solved using the linear equation fitted in problem 2. In response to the fourth problem, on the basis of the existing sales, wholesale price and loss rate data, in
{"title":"Automated pricing and replenishment decisions for vegetable products based on evaluation optimization models","authors":"Zhichun Wei","doi":"10.62051/601jnn43","DOIUrl":"https://doi.org/10.62051/601jnn43","url":null,"abstract":"Based on the commodity information of the supermarket in the annex, the detailed data of historical sales flow, the wholesale price of vegetable commodities and the recent loss rate of vegetable commodities, and through the data analysis of each category and each single product, the automatic pricing and replenishment decision-making model of commodities is established. Use the optimization evaluation algorithm to formulate the total daily replenishment and pricing strategy of each category and each single product. In order to solve the first problem, firstly, the outliers in the original data of Annexes 2 and 3 are cleaned, normalized, feature selected and dimensionally reduced. Secondly, a quarter is taken as a sales cycle of supermarkets, so as to find the proportion of sales volume of a certain category in the same quarter of three years to the total sales volume, and give the distribution law of sales volume of different categories, the results are shown. Considering different periods again, the daily sales volume distribution law is calculated by taking one day as a sales cycle, and the results are shown. Finally, the Pearson grade correlation coefficient is used to judge the relationship between the processing indicators, and the matrix heat map is obtained. According to the two results, it was concluded that there was a significant positive correlation between the sales volume of mosaic and cauliflower vegetables, and a significant negative correlation between the sales volume of nightshade and aquatic root vegetables. 
In view of the second problem, firstly, considering the functional relationship between the total sales volume and the cost pricing, the correlation analysis and linear fitting were carried out to obtain the linear relationship between the sales price of each category and the maximum value of the sales volume of each category in July of the previous year can be described as Through further nonlinear fitting and optimization problem solving, the total daily replenishment volume and pricing strategy of each vegetable category in the coming week (July 1-7, 2023) are shown in Table 1 and Table 2, which makes the supermarket have the largest revenue In response to the third question, based on the known data, we can analyze the data requirements for each data: we need to know the sales volume of various vegetables during this period, we need to determine the purchase cost of each vegetable, we need to understand the past pricing strategy and response, and we need to know the inventory of various vegetables on June 30. On this basis, a multi-objective dynamic programming model is established, and the total number of saleable items is 30 by using the greedy algorithm to obtain the replenishment quantity of single items on July 1, and the pricing strategy is further solved by using the linear equation fitted in problem 2. 
In response to the fourth problem , on the basis of the existing sales, wholesale price and loss rate data, in","PeriodicalId":509968,"journal":{"name":"Transactions on Computer Science and Intelligent Systems Research","volume":"7 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-04-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140716943","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
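The correlation step in the abstract above (a coefficient per pair of categories, collected into a matrix for a heat map) can be sketched as follows. The sketch assumes the ordinary Pearson product-moment coefficient is meant; the abstract's wording, "Pearson grade correlation", leaves open whether a rank-based coefficient was used instead. The category names and sales figures are hypothetical placeholders, not data from the paper.

```python
import math


def pearson(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


def correlation_matrix(series):
    """Pairwise correlations as a nested dict, ready to plot as a heat map."""
    names = list(series)
    return {a: {b: pearson(series[a], series[b]) for b in names} for a in names}


# Hypothetical daily sales (kg) for three made-up categories.
sales = {
    "category_a": [10.0, 12.0, 14.0, 13.0, 17.0],
    "category_b": [20.0, 25.0, 27.0, 26.0, 35.0],
    "category_c": [9.0, 8.0, 6.0, 7.0, 3.0],
}
matrix = correlation_matrix(sales)
```

In this toy data, `category_a` and `category_b` rise together and `category_c` falls as they rise, so the matrix contains both a positive and a negative off-diagonal entry, the two kinds of relationship the abstract reports.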
This paper uses linear regression to quantitatively analyse the performance of players in the men's singles competition at Wimbledon 2023. First, the match data are inspected and processed to ensure that they comply with tournament standards and regulations. Next, key metrics are extracted, including short-term and long-term metrics, and a Serve Indicator is introduced to account for the effect of serve advantage on player performance. Then, the most important independent variables are identified through Random Forest feature analysis, and the parameters are estimated by least squares to construct the performance indicators used in the linear regression. Finally, data visualisation and analysis show that player 1 usually performs better at critical moments, with greater stability and consistency, while player 2 shows greater variability and unpredictability. Overall, the linear regression method in this paper is a practical way to quantify tennis players' performance, and can serve as a reference for players and coaches seeking to analyse and improve performance.
{"title":"Quantifying Tennis Player Performance: A Linear Regression Approach","authors":"Yuxi Zeng, Siwei Zhong","doi":"10.62051/txzhx330","DOIUrl":"https://doi.org/10.62051/txzhx330","url":null,"abstract":"This paper uses linear regression to quantitatively analyse the performance of players in the men's singles competition at Wimbledon 2023. Firstly, the data is processed by observationally analysing the match data to ensure compliance with the tournament standards and regulations. Next, key metrics were extracted, including short-term and long-term metrics, as well as the introduction of Serve Indicator to consider the impact of serve advantage on player performance. Then, the most important independent variables were identified through Random Forest feature analysis and parameters were calculated using least squares to construct performance indicators for use in linear regression. Finally, through data visualisation and analysis, it was found that player 1 usually performs better at critical moments, showing greater stability and consistency, while player 2 shows greater variability and unpredictability. Overall, the linear regression method in this paper is valuable and practical for quantifying tennis players' performance, and can provide a reference for players and coaches to help them better analyse and improve their performance.","PeriodicalId":509968,"journal":{"name":"Transactions on Computer Science and Intelligent Systems Research","volume":"1992 10","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-04-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140718842","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Cardiovascular and cerebrovascular diseases (CVDs), diabetes mellitus, malignant tumours and chronic obstructive pulmonary disease (COPD), as typical chronic non-communicable diseases (NCDs), have increasingly become major threats to the health of Chinese residents. In view of this, this study analyses in depth the key indicators affecting the population's health status and their interrelationships, using the eight dietary balance criteria proposed in the newly revised Dietary Guidelines for Chinese Residents of the Chinese Society of Nutrition to screen and process the relevant health indicators. By constructing a comprehensive evaluation model and visualising the data, the study reveals unhealthy factors in residents' dietary habits, such as high-fat diets and excessive alcohol consumption, that pose potential risks to residents' health. In addition, the paper explores the relationship between residents' dietary habits, exercise frequency and healthy body weight, aiming to provide a scientific basis and practical guidance for improving public health.
{"title":"Dietary diversity and healthy weight management in the population: a data-driven analytical approach","authors":"Wenwen Zhang, Jianting Ye, Haodi Zhang","doi":"10.62051/gw2p5t41","DOIUrl":"https://doi.org/10.62051/gw2p5t41","url":null,"abstract":"Cardiovascular and cerebrovascular diseases (CVDs), diabetes mellitus, malignant tumours and chronic obstructive pulmonary disease (COPD), as typical chronic non-communicable diseases (NCDs), have increasingly become the major factors threatening the health of Chinese residents. In view of this, this study aimed to deeply analyse the key indicators and their interrelationships affecting the health status of the population, using the eight dietary balance criteria proposed in the newly revised Dietary Guidelines for Chinese Residents of the Chinese Society of Nutrition to screen and process the relevant health indicators. By constructing a comprehensive evaluation model and visualising the data, this study reveals the irrational factors in residents' dietary habits, such as high-fat diet and excessive alcohol consumption, which pose potential risks to residents' health. In addition, this paper explores the relationship between residents' dietary habits, exercise frequency and healthy body weight, aiming to provide a scientific basis and practical guidance for improving public health.","PeriodicalId":509968,"journal":{"name":"Transactions on Computer Science and Intelligent Systems Research","volume":"14 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-04-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140717777","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
With the continued development of the domestic stock market, the ongoing improvement of the financial system, and the stock market's growing role within that system, research on predicting the domestic stock market is becoming increasingly important. To address the low precision and poor accuracy of short-term stock price prediction, this paper uses a bidirectional long short-term memory network with an attention mechanism, tuned by the whale optimization algorithm (WOA-BiLSTM-Attention), for stock price prediction. Modeling with a bidirectional long short-term memory network and an attention mechanism reduces the loss of historical information and increases the influence of important information. On this basis, the Whale Optimization Algorithm (WOA) is used for hyperparameter selection to reduce human interference. The experimental results show that, compared with BP, LSTM, BiLSTM and BiLSTM-Attention, the WOA-BiLSTM-Attention model predicts stock closing prices better, with reported values of 13.9446 and 0.9477 on its evaluation metrics, indicating higher accuracy; the approach may also serve as a reference for prediction research in other fields.
{"title":"Research on Stock Price Prediction and Quantitative Stock Picking Strategy Based on Deep Learning","authors":"Jiahao Ji","doi":"10.62051/v47p3p43","DOIUrl":"https://doi.org/10.62051/v47p3p43","url":null,"abstract":"With the continuous development of the domestic stock market and the continuous improvement of the financial system system, and at the same time, the domestic stock market gradually rises in the financial system, based on the prediction research of the domestic stock market will become more and more important. In order to solve the problems of low precision and poor accuracy of short-term stock price prediction, this paper selects the bi-directional long- and short-term memory network of attention mechanism (WOA-BiLSTM-Attenion) model under the whale optimization algorithm for stock price prediction. The modeling of bi-directional long- and short-term memory network with attention mechanism can reduce the loss of historical information and increase the influence of important information. On this basis, Whale Optimization Algorithm (WOA) is then used for hyperparameter selection to reduce human interference. 
The experimental results show that compared with BP, LSTM, BiLSTM, BiLSTM-Attention, the WOA-BiLSTM-Attenion model has a better effect on stock closing price prediction, with a value of 13.9446, and the value of 0.9477, which has a higher accuracy, with a view to providing certain reference for the prediction research in other fields.","PeriodicalId":509968,"journal":{"name":"Transactions on Computer Science and Intelligent Systems Research","volume":"2010 18","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-04-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140718552","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
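The abstract above says the attention mechanism "increases the influence of important information". One common way to do this is to score each time step, turn the scores into weights with a softmax, and take the weighted sum of the hidden states. The sketch below shows that idea only. The abstract does not specify the paper's attention form, so dot-product scoring, the fixed `query` vector (learned in a real model), and the toy hidden states are all assumptions, and no BiLSTM or whale optimization is implemented here.

```python
import math


def softmax(scores):
    """Numerically stable softmax: subtract the maximum before exponentiating."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states, query):
    """Dot-product attention over time steps.

    hidden_states: one vector per time step (e.g. BiLSTM outputs).
    query: a vector of the same length; hypothetical and fixed here.
    Returns (weights, context), where context is the weighted sum of states.
    """
    scores = [sum(h_i * q_i for h_i, q_i in zip(h, query)) for h in hidden_states]
    weights = softmax(scores)
    dim = len(hidden_states[0])
    context = [sum(w * h[d] for w, h in zip(weights, hidden_states)) for d in range(dim)]
    return weights, context


# Hypothetical hidden states for four time steps (made-up numbers).
states = [[0.1, 0.0], [0.2, 0.1], [0.9, 0.8], [0.3, 0.2]]
query = [1.0, 1.0]
weights, context = attention_pool(states, query)
```

The third time step has the largest score against the query, so it receives the largest weight and dominates the context vector, which a downstream layer would then map to the predicted closing price.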
Based on an analysis of the structural characteristics of standard rail transit station platforms and of passenger behavior, and combined with measured data, this paper studies the distribution of passengers on the platform in a standard station environment. By varying the passenger flow, it examines how changes in flow affect the overall distribution, providing a new idea and method for studying passenger distribution on standard rail transit platforms.
{"title":"Passenger Distribution Regulation of Rail Transit Platform in Standard Station Environment","authors":"Shijie Zhang, Haihang Li, Hao Sun","doi":"10.62051/f7tqeh31","DOIUrl":"https://doi.org/10.62051/f7tqeh31","url":null,"abstract":"Based on the analysis of the construction characteristics and passenger behavior characteristics of the standard station platform of rail transit, combined with the actual data, under the environment of the standard station, the distribution law of the platform passengers is studied. Through the change of different flow, the influence of the change on the overall distribution is studied, which provides a new idea and method for the study of the passenger distribution on the standard platform of rail transit.","PeriodicalId":509968,"journal":{"name":"Transactions on Computer Science and Intelligent Systems Research","volume":"56 12","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-12-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139167494","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Currently, most image classification tasks are solved by supervised learning. High-quality datasets are difficult to annotate; datasets in real-world applications have a nonlinear structure, and the annotation cost grows exponentially with the number of targets and the difficulty of recognition. In this context, research on unsupervised image classification is a natural direction. Traditional unsupervised learning for classification is mostly based on the Euclidean distance and various paradigms, and cannot extract the nonlinear structure of a dataset, which sharply reduces the accuracy of traditional unsupervised image classification. In this paper, we propose to first extract the nonlinear structure of the original dataset using a manifold learning method and then produce pseudo-labels with an agglomerative clustering algorithm. Pseudo-labels obtained in this way retain the mathematical structure of the original data with high accuracy. A neural network is then trained on these pseudo-labels to obtain a usable unsupervised image classifier. The classifier can be trained on small-scale data and then applied to large-scale datasets, saving the cost of manual labelling. The experiments use a control group and two manifold learning groups, which extract the nonlinear structure with the LLE and Isomap algorithms respectively. Pseudo-labels are then produced and the neural networks trained, and the accuracy of the three groups is compared. The two groups that use manifold learning to extract the nonlinear structure achieve much higher accuracy than the control group, and the image classifier based on the Isomap algorithm reaches an accuracy of 85% on the test set, which is highly practical.
{"title":"Unsupervised Image Classifier based on Manifold Learning","authors":"Jinghao Situ","doi":"10.62051/31s5nw90","DOIUrl":"https://doi.org/10.62051/31s5nw90","url":null,"abstract":"Currently most of image classification tasks are achieved by supervised learning. High-quality datasets naturally bring difficulties in annotation, and the datasets in real-world applications present a nonlinear structure, and the annotation cost grows exponentially with the number of targets and the difficulty of recognisability. In this context, research about unsupervised image classification is the way to go. Traditional unsupervised learning for classification is mostly based on the Euclidean distance and various paradigms, which is unable to extract the nonlinear structure of the dataset. This shortcoming makes the accuracy of traditional unsupervised image classification drop drastically. In this paper, we propose to first extract the nonlinear structure of the original dataset using the manifold learning method, and then produce pseudo-labels through the agglomerative clustering algorithm. The pseudo-labels obtained in this way can effectively retain the special mathematical structure of the original data with high accuracy. The neural network is trained with these pseudo labels to obtain an unsupervised usable image classifier. The classifier can be trained on small-scale data and then applied to large-scale data sets, thus saving the cost of manual labelling. The experiments are carried out by setting up a control group and two manifold learning groups for the extraction of non-linear structures using LLE and Isomap algorithms respectively. After that, the production of pseudo-labels and the training of neural networks are completed, and the accuracy of the three groups is compared. 
Finally, it is concluded that the correct rate of the two groups that have gone through the manifold learning algorithm to extract the nonlinear structure is much higher than that of the other one, and the image classifier based on the Isomap algorithm achieves an accuracy of 85% in the test set, which is highly practical.","PeriodicalId":509968,"journal":{"name":"Transactions on Computer Science and Intelligent Systems Research","volume":"31 19","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-12-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139166768","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
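The pseudo-labelling step in the abstract above, agglomerative clustering applied to the low-dimensional embedding, can be sketched in a few lines. The sketch assumes the manifold learning step (LLE or Isomap) has already produced the embedding, which is represented here by hypothetical 2-D coordinates. The abstract does not state the linkage criterion, so average linkage is an assumption, and `agglomerative_labels` is an illustrative name rather than the author's code.

```python
import math


def agglomerative_labels(points, n_clusters):
    """Average-linkage agglomerative clustering; returns one label per point."""
    # Start with every point in its own cluster (stored as lists of indices).
    clusters = [[i] for i in range(len(points))]

    def linkage(a, b):
        # Average pairwise Euclidean distance between two clusters.
        total = sum(math.dist(points[i], points[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > n_clusters:
        # Find the closest pair of clusters and merge the second into the first.
        pairs = [(linkage(clusters[i], clusters[j]), i, j)
                 for i in range(len(clusters))
                 for j in range(i + 1, len(clusters))]
        _, i, j = min(pairs)
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i is unaffected

    labels = [0] * len(points)
    for label, members in enumerate(clusters):
        for idx in members:
            labels[idx] = label
    return labels


# Hypothetical 2-D embedding standing in for LLE/Isomap output (made-up values).
embedding = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3),
             (5.0, 5.1), (5.2, 4.9), (4.9, 5.3)]
pseudo_labels = agglomerative_labels(embedding, n_clusters=2)
```

The resulting `pseudo_labels` would then play the role of ground-truth labels when training the neural network classifier. This naive implementation recomputes all pairwise linkages on every merge, so it is suitable only for small illustrative inputs.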
As an important part of the gas pipeline safety system, the venting pipe protects the safe operation of the network and reduces the consequences of accidents, so the feasibility, rationality and safety of its design are particularly important. In China, most of the online simulation software used for venting simulation of gas pipeline networks is expensive, hard to control, and difficult to adapt to specific working conditions, so the development and application of dynamic simulation software for gas pipeline networks is of great importance. In this paper, simulation software is developed for city gas pipeline venting systems: the simulations are implemented in a programming language, the usability of the developed software is verified, and the influence of different factors on the venting process is investigated using the control variable method.
{"title":"Research on Dynamic Venting Characteristics of City Gas Pipeline Networks and Analysis of Software Development","authors":"Yu Weng, Xiaomao Sun, Mengjiao Gou, Bohua Liu, Shuang Yang, Zhian Deng","doi":"10.62051/2xtdw078","DOIUrl":"https://doi.org/10.62051/2xtdw078","url":null,"abstract":"As an important part of the gas pipeline safety system, the venting pipe plays a role in protecting the safe operation of the network and reducing the consequences of accidents, and its design feasibility, rationality and safety are particularly important. In China, most of the online simulation software used for gas pipeline network venting simulation is limited to expensive, poorly controllable software and difficult to design for specific working conditions, so the application of dynamic simulation software for gas pipeline networks is of great importance. In this paper, simulation software is developed for city gas pipeline venting systems, simulations are carried out using a programming language, the usability of the developed software is verified and the influence of different influencing factors on the venting process is investigated using the control variable method of analysis.","PeriodicalId":509968,"journal":{"name":"Transactions on Computer Science and Intelligent Systems Research","volume":"21 8","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-12-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139167417","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}