Fraudulent promotion on GitHub artificially inflates the Stars and Forks of specific repositories. This behavior harms the open source community, yet GitHub does not detect it effectively. This paper applies a heterogeneous neural network to detect repositories suspected of fraudulent promotion. A heterogeneous mini-graph neural network with an attention mechanism and hyper-graph generation is proposed to detect repositories that exhibit cheating behavior. The attention mechanism dynamically balances the weights of the semantics in heterogeneous information networks. The hyper-graph generation method addresses the poor connectivity caused by the many small graphs in the dataset. Experimental results show that the model can effectively detect this kind of cheating behavior.
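The abstract above does not say how its attention mechanism is defined. As a rough illustration only, the sketch below shows one common way to balance semantics in a heterogeneous network: softmax weights over one embedding per semantic (for example, per meta-path). The function name `semantic_attention` and the idea that the importance scores come from a learned scoring layer are assumptions, not details from the paper.

```python
import math


def semantic_attention(path_embeddings, scores):
    """Fuse one embedding per semantic using softmax attention weights.

    path_embeddings: list of equal-length vectors, one per semantic.
    scores: one unnormalized importance score per semantic. In a trained
            model these would come from a learned scoring layer (assumed).
    Returns (weights, fused_embedding).
    """
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]  # numerically stable softmax
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(path_embeddings[0])
    fused = [
        sum(w * z[d] for w, z in zip(weights, path_embeddings))
        for d in range(dim)
    ]
    return weights, fused
```

With equal scores the semantics are averaged; a larger score shifts the fused embedding toward that semantic.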
{"title":"Fraudulent promotion detection on GitHub using heterogeneous neural network","authors":"Zexin Ning, Pengtao Pu, Jiashen Lin","doi":"10.1117/12.2667534","DOIUrl":"https://doi.org/10.1117/12.2667534","url":null,"abstract":"There are fraudulent promotion behaviors in GitHub, which promotes Stars and Forks for specific repositories. It is harmful to the environment of the open source community, while it is not effectively detected by GitHub yet. This paper applies a heterogeneous neural network to detect repositories that are suspected of fraudulent promotion behavior. A heterogenous mini-graph neural network with attention mechanism and hyper-graph generation is proposed to detect repositories with cheating behaviors. Attention mechanism can dynamically balance the weight of semantics in heterogeneous information networks. Hyper-graph generation method can solve the problem of poor connectivity caused by many small graphs in the dataset. The experimental result shows that the model can effectively detect this kind of cheating behavior.","PeriodicalId":128051,"journal":{"name":"Third International Seminar on Artificial Intelligence, Networking, and Information Technology","volume":"20 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-02-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129939645","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
When training neural networks, the architecture is usually fixed first and the parameters are then chosen by an optimizer. The choices of architecture and parameters are often independent, so whenever the architecture is modified, the parameters must be retrained at great expense. In this work, we focus on growing the architecture instead of performing this expensive retraining. There are two main ways to grow new neurons: splitting and adding. In this paper, we propose orthogonal initialization to mitigate vanishing gradients in newly added neurons. We obtain the orthogonal initialization using QR decomposition. We performed detailed experiments on two datasets (CIFAR-10 and CIFAR-100), and the experimental results show the efficiency of our method.
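The QR step mentioned above can be illustrated with a minimal sketch: take the Q factor of a QR decomposition of a random Gaussian matrix, so that its columns are orthonormal. The function name `orthogonal_init`, the use of modified Gram-Schmidt to compute Q, and the matrix shape are assumptions for illustration. The abstract does not say how the new neurons are made orthogonal to the existing ones, so that part is not shown.

```python
import math
import random


def orthogonal_init(n_rows, n_cols, seed=0):
    """Return an n_rows x n_cols matrix (n_rows >= n_cols) with orthonormal columns.

    The result is the Q factor of a QR decomposition of a random Gaussian
    matrix, computed here with modified Gram-Schmidt.
    """
    if n_rows < n_cols:
        raise ValueError("need n_rows >= n_cols")
    rng = random.Random(seed)
    # Columns of a random Gaussian matrix.
    cols = [[rng.gauss(0.0, 1.0) for _ in range(n_rows)] for _ in range(n_cols)]
    q_cols = []
    for v in cols:
        v = list(v)
        # Remove the components along the columns already orthonormalized.
        for q in q_cols:
            proj = sum(a * b for a, b in zip(q, v))
            v = [a - proj * b for a, b in zip(v, q)]
        norm = math.sqrt(sum(a * a for a in v))
        q_cols.append([a / norm for a in v])
    # Convert the list of columns to a row-major matrix.
    return [[q_cols[j][i] for j in range(n_cols)] for i in range(n_rows)]
```

In practice a library routine such as `numpy.linalg.qr` would normally replace the hand-written loop; the pure-Python version is used here only to keep the sketch dependency-free.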
{"title":"Growing neural networks using orthogonal initialization","authors":"Xinglin Pan","doi":"10.1117/12.2667654","DOIUrl":"https://doi.org/10.1117/12.2667654","url":null,"abstract":"In the training of neural networks, the architecture is usually determined first and then the parameters are selected by an optimizer. The choice of architecture and parameters is often independent. Whenever the architecture is modified, an expensive retraining of the parameters is required. In this work, we focus on growing the architecture instead of the expensive retraining. There are two main ways to grow new neurons: splitting and adding. In this paper, we propose orthogonal initialization to mitigate the gradient vanish of the new adding neurons. We use QR decomposition to obtain orthogonal initialization. We performed detailed experiments on two datasets (CIFAR-10, CIFAR-100) and the experimental results show the efficiency of our method.","PeriodicalId":128051,"journal":{"name":"Third International Seminar on Artificial Intelligence, Networking, and Information Technology","volume":"12587 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-02-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131064019","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
To address the fuzzy boundaries and poor segmentation quality of SwinUNet in melanoma image segmentation, an improved SwinUNet segmentation method is proposed. First, the Dice loss function is used to alleviate the imbalance between the background and the target region. Second, each decoder layer fuses the smaller-scale feature map from the encoder, the same-scale feature map, and the larger-scale feature map from the decoder, so that fine-grained and coarse-grained semantics are captured at full scale. Finally, the sliding window size is increased to enlarge the receptive field of the model, and the Dice coefficient is used to evaluate the segmentation results. The average Dice values of the original SwinUNet and the three improved models were 0.8311, 0.8689, 0.8719, and 0.8661, respectively. The experimental results show that the improved models proposed in this paper effectively improve on the accuracy of the original model, which is important for the early diagnosis and treatment of melanoma.
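For reference, the Dice coefficient and the Dice loss named above can be sketched as follows for flattened masks. The smoothing constant `eps` and the flat-list input format are assumptions for illustration; the abstract does not give the exact formulation used in the paper.

```python
def dice_coefficient(pred, target, eps=1e-7):
    """Dice = 2 * |P and T| / (|P| + |T|) for flat masks with values in [0, 1]."""
    intersection = sum(p * t for p, t in zip(pred, target))
    return (2.0 * intersection + eps) / (sum(pred) + sum(target) + eps)


def dice_loss(pred, target, eps=1e-7):
    """Loss that decreases as the overlap between prediction and target grows."""
    return 1.0 - dice_coefficient(pred, target, eps)
```

Because the score depends only on the overlap relative to the sizes of the two masks, a large background area does not dominate it, which is why Dice loss is often chosen for imbalanced segmentation.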
{"title":"Research on melanoma image segmentation method based on improved SwinUNet","authors":"Zhenyue Zhu, Yingshu Lu","doi":"10.1117/12.2667246","DOIUrl":"https://doi.org/10.1117/12.2667246","url":null,"abstract":"Aiming at the problems of fuzzy boundary and poor segmentation effect of SwinUNet in melanoma image segmentation, an improved SwinUNet network segmentation method was proposed. Firstly, Dice loss function is used to alleviate the background and regional imbalance. Secondly, each decoder layer is made to fuse the smaller scale from the encoder, the same scale feature map and the larger scale feature map from the decoder, so that the fine-grained semantics and coarse-grained semantics at the full scale can be captured . Finally, the size of the sliding window is increased, the receptive field of the model is enlarged, and the Dice coefficient is used to evaluate the segmentation results. The average Dice values of the original SwinUNet and the three improved models were 0.8311, 0.8689, 0.8719 and 0.8661, respectively. The experimental results show that the improved model proposed in this paper can effectively improve the accuracy of the original model, which is extremely important for the early diagnosis and treatment of melanoma.","PeriodicalId":128051,"journal":{"name":"Third International Seminar on Artificial Intelligence, Networking, and Information Technology","volume":"77 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-02-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124795052","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
To improve the accuracy of early warning for online public opinion crises about power supply enterprise services, this work introduces fuzzy reasoning theory into the design of a public opinion early warning method. An early warning index system is constructed from public opinion topic intensity, development heat, and public attitude. Using fuzzy reasoning theory, the membership degree of each index and of each early warning level is calculated. The rules for judging the early warning level are then learned, after which the level is judged and the warning is displayed. Experiments show that the new method can accurately judge the severity of a public opinion crisis and give a reasonable and intuitive early warning display.
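To make the notion of a membership degree concrete, the sketch below uses triangular membership functions over a normalized index score. Everything specific in it is hypothetical: the three level names, their breakpoints, and the maximum-membership decision rule are illustrative choices, whereas the method in the abstract learns its judgment rules.

```python
def triangular(x, a, b, c):
    """Triangular membership: 0 at a and c, rising to 1 at b (a < b < c)."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


# Hypothetical warning levels over a normalized index score in [0, 1].
LEVELS = {
    "low": (-0.5, 0.0, 0.5),
    "medium": (0.0, 0.5, 1.0),
    "high": (0.5, 1.0, 1.5),
}


def level_memberships(score):
    """Membership degree of the score in each warning level."""
    return {name: triangular(score, *abc) for name, abc in LEVELS.items()}


def warning_level(score):
    """Pick the level with the largest membership degree (illustrative rule)."""
    memberships = level_memberships(score)
    return max(memberships, key=memberships.get)
```

A score of 0.25 belongs equally to "low" and "medium" under these breakpoints, which is the kind of graded judgment that a crisp threshold cannot express.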
{"title":"Early warning method of power supply enterprise service network public opinion based on fuzzy reasoning","authors":"Qianqian Li, Wenjie Fan, Xiaozhou Shen, Jing Li","doi":"10.1117/12.2667502","DOIUrl":"https://doi.org/10.1117/12.2667502","url":null,"abstract":"To improve the accuracy of the power supply enterprise service network public opinion crisis early warning, the fuzzy reasoning theory is introduced to carry out the design research of the power supply enterprise service network public opinion early warning method. Based on public opinion topic intensity, development heat and public attitude, the power supply enterprise service network public opinion early warning index system is constructed. Combined with fuzzy reasoning theory, the index membership degree and early warning level membership degree are calculated. Through the learning method, the public opinion early warning level judgment rule is learned, and the public opinion early warning level judgment and early warning display are completed. The experiment proves that the new public opinion early warning method can accurately judge the degree of public opinion crisis, and give a reasonable and intuitive early warning display result.","PeriodicalId":128051,"journal":{"name":"Third International Seminar on Artificial Intelligence, Networking, and Information Technology","volume":"93 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-02-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121600613","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Since the outbreak of COVID-19 in 2020, the timely and effective diagnosis and treatment of each COVID-19 patient has been particularly important. This paper draws on the strengths of deep learning in image recognition, takes ResNet as the base network, and experiments with improvements to the residual structure. Tested on an open-source COVID-19 chest radiograph dataset, the model reaches an accuracy of 82.3%. Across a series of experiments, the trained model shows good generalization, high accuracy, and fast convergence. The results demonstrate the feasibility of the improved residual neural network for diagnosing COVID-19.
{"title":"Research on chest x-ray image diagnosis of COVID-19 based on improved ResNet","authors":"J. Sun","doi":"10.1117/12.2667614","DOIUrl":"https://doi.org/10.1117/12.2667614","url":null,"abstract":"With the outbreak of covid-19 in 2020, timely and effective diagnosis and treatment of each covid-19 patient is particularly important. This paper combines the advantages of deep learning in image recognition, takes RESNET as the basic network framework, and carries out the experiment of improving the residual structure on this basis. It is tested on the open source new coronal chest radiograph data set, and the accuracy rate is 82.3%. Through a series of experiments, the training model has the advantages of good generalization, high accuracy and fast convergence. This paper proves the feasibility of the improved residual neural network in the diagnosis of covid-19.","PeriodicalId":128051,"journal":{"name":"Third International Seminar on Artificial Intelligence, Networking, and Information Technology","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-02-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130481975","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Infrared small vehicle target detection plays an important role in infrared search and tracking systems. Target detection methods based on deep learning are developing rapidly, but existing approaches often perform poorly on small targets. In this study, we propose an improved SSD (Single Shot MultiBox Detector) that improves the detection of infrared small targets in three respects. First, we replace the 3~6 max pooling layers in the original algorithm with strided convolution layers. Second, we design a shallow feature layer information enhancement module that semantically fuses the feature maps of the shallow and deep feature layers and uses a new pyramid structure to detect the target. Third, we introduce residual units and use the MSRA function to initialize the weights of the neurons in each layer at the beginning of training. To evaluate the proposed Infrared-SSD, the model was trained and tested on an infrared vehicle dataset created by our team. Experimental results show that Infrared-SSD has higher accuracy than the original SSD algorithm. For an input of 300×300 pixels, Infrared-SSD achieved a test mAP (mean Average Precision) of 82.02%.
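MSRA initialization (also known as He initialization) draws each weight from a zero-mean Gaussian with variance 2 / fan_in, where fan_in is the number of inputs feeding a neuron. The sketch below illustrates that rule only. The function names and the plain nested-list weight layout are assumptions, and the layer shapes used in Infrared-SSD are not given in the abstract.

```python
import math
import random


def conv_fan_in(kernel_size, in_channels):
    """Number of inputs per output unit of a square convolution kernel."""
    return kernel_size * kernel_size * in_channels


def msra_init(fan_in, fan_out, seed=0):
    """Sample a fan_out x fan_in weight matrix from N(0, 2 / fan_in)."""
    rng = random.Random(seed)
    std = math.sqrt(2.0 / fan_in)
    return [[rng.gauss(0.0, std) for _ in range(fan_in)] for _ in range(fan_out)]
```

For a 3×3 convolution over 64 input channels, `conv_fan_in(3, 64)` gives 576, so the standard deviation would be `sqrt(2 / 576)`.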
{"title":"An infrared small vehicle target detection method based on deep learning","authors":"Xiaofeng Zhao, Yuting Xia, Mingyang Xu, wewen zhang, Jiahui Niu, Zhili Zhang","doi":"10.1117/12.2667313","DOIUrl":"https://doi.org/10.1117/12.2667313","url":null,"abstract":"Infrared small vehicle target detection plays an important role in infrared search and tracking systems applications. The target detection methods based on deep learning are developing rapidly, but the existing approaches always perform poorly for the detection of small target. In this study, we propose an improved SSD(Single Shot MultiBox Detector) to improve the detection performance of infrared small targets from three aspects. First of all, we recommend using the stride convolution layer to replace the 3~6 maximum pooling layers in the original algorithm; second, design a shallow feature layer information enhancement module, semantically fusing the feature maps of the shallow feature layer and the deep feature layer, and using a new pyramid structure to detect the target; third, introducing residual unit and use the MSRA function to initialize the weights of the neurons in each layer at the beginning of training. To evaluate the Infrared-SSD proposed in this paper, the infrared vehicle data set created by this team was used to train and test the model. Experimental results show that Infrared-SSD has higher accuracy than the original SSD algorithm. 
For an input of 300pixel×300pixel, Infrared-SSD got a mAP(mean Average Precision) test score of 82.02%.","PeriodicalId":128051,"journal":{"name":"Third International Seminar on Artificial Intelligence, Networking, and Information Technology","volume":"62 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-02-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131714832","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Yanyin Xie, Rui Yang, Ruihan Hu, Lin Gan, Hualin Ke
This paper focuses on how to ensure the availability and effectiveness of massive cloud data for industrial robots on flexible production lines, addresses the technical challenge of building a massive-data cloud platform for industrial robots, and resolves the engineering problem of applying cloud services to industrial robots. To this end, research is conducted on hybrid cloud platform architecture for industrial robots, network technology, industrial robot big data systems, autonomous-learning cloud data processing, and other technologies that support cloud service applications. It is suggested to combine knowledge graphs, digital twins, deep neural networks, transfer learning, and other artificial intelligence technologies, which benefits cloud service applications for remote monitoring and fault diagnosis. The approach has been verified and promoted on handling, polishing, stacking, welding, assembly, and other robots in the 3C, mold, household appliance, automobile, furniture, electronic equipment manufacturing, and other industries.
{"title":"Flexible production line product research and development and manufacturing cloud platform based on intelligent data collaboration","authors":"Yanyin Xie, Rui Yang, Ruihan Hu, Lin Gan, Hualin Ke","doi":"10.1117/12.2667215","DOIUrl":"https://doi.org/10.1117/12.2667215","url":null,"abstract":"This paper focuses on how to ensure the availability and effectiveness of massive cloud data for industrial robots in the flexible production line, address the technical challenge in building a massive data cloud platform for industrial robots, and resolve the engineering problem of cloud based industrial robot cloud service application. To achieve this purpose, research is conducted on industrial robot hybrid cloud platform architecture, network technology, industrial robot big data system, autonomous learning cloud data processing and other technologies, which provides support for cloud service applications. It is suggested to combine knowledge atlas, digital twins, deep neural network, migration learning and other artificial intelligence technologies, which is conducive to remote monitoring and fault diagnosis cloud service applications. 
This has been verified and promoted in the handling, polishing, stacking, welding, assembly and other robots in 3C, mold, household appliances, automobile, furniture, electronic equipment manufacturing and other industries.","PeriodicalId":128051,"journal":{"name":"Third International Seminar on Artificial Intelligence, Networking, and Information Technology","volume":"84 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-02-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123623992","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
With the rapid development of artificial intelligence technology, emotion recognition has been applied in many aspects of life, and emotion recognition based on eye tracking has become an important branch of affective computing. To explore the relationship between eye movement signals and learners' emotional states in an online video learning environment, we used machine learning and convolutional neural network methods to recognize eye movement signals and classify learners' emotional states into two categories, positive and negative. The study of eye movement data under different time windows comprises four stages: data collection, data preprocessing, classifier modeling, and training and testing. This paper proposes an Eye-movement Feature Extraction Classification Network (EFECN), based on convolutional neural networks, for classifying emotional states from small samples of eye movement data. The eye movement data were transformed into images through cross-modal conversion and used as input to several different deep convolutional neural networks, which classified the emotional states as positive or negative. Accuracy was used as the evaluation metric to compare the different models. The eye movement emotion recognition accuracy reached 72% with the SVM model and 91.62% with the EFECN model. Experimental results show that the deep convolutional neural network significantly improves recognition accuracy compared with traditional machine learning methods.
{"title":"Emotion recognition research of eye-movement feature extraction classification network in online video learning environment","authors":"Shengxi Liu, Ze-ping Li, Xiaomei Tao","doi":"10.1117/12.2667404","DOIUrl":"https://doi.org/10.1117/12.2667404","url":null,"abstract":"With the rapid development of artificial intelligence technology, emotion recognition has been applied in all aspects of life, using eye movement tracking technology for emotion recognition has become an important branch of emotion computing. In order to explore the relationship between eye movement signals and learners' emotional states in the online video learning environment, we used machine learning and convolutional neural network methods to recognize eye movement signals, and classify learners' emotional states into two categories, positive and negative. The study of eye movement data under different time windows mainly includes four stages: data collection, data preprocessing, classifier modeling, training and testing. In this paper, a Eye-movement Feature Extraction Classification Network(EFECN) based on convolutional neural network is proposed for small sample data and the classification of emotion state based on eye movement. The eye movement data were transformed into images through cross-modal conversion as input of multiple different deep convolutional neural networks, and the emotional states were classified in positive and negative directions. The accuracy was used as the evaluation index to evaluate and compare the different models. The accuracy of the eye movement emotion recognition model reached 72% in the SVM model and 91.62% in the EFECN model. 
Experimental results show that the convolutional neural network based on deep learning has a significant improvement in recognition accuracy compared with traditional machine learning methods.","PeriodicalId":128051,"journal":{"name":"Third International Seminar on Artificial Intelligence, Networking, and Information Technology","volume":"09 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-02-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124536872","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The classic TextRank algorithm struggles to differentiate the degree of association between candidate keyword nodes. It also tends to ignore long-distance syntactic relations and topic-level semantic information between words when extracting keywords from a document. To solve these problems, we propose an improved TextRank algorithm that uses lexical, grammatical, and semantic features to find target keywords in Chinese academic text. First, we construct the word graph of candidate keywords after text preprocessing. Second, we integrate multidimensional features of the candidate words into the calculation of the transition probability matrix. To this end, our approach mines the full text to extract a collection of grammatical and morphological features (such as part of speech, word position, long-distance dependencies, and BERT dynamic semantic information). Introducing the dependency syntax of long sentences clearly improves the algorithm's ability to identify low-frequency topic keywords. In addition, external semantic information is imported through a word embedding model. The merged feature-based matrix is then used to calculate the influence of all candidate keyword nodes with the iterative PageRank formula. We obtain the final keywords by ranking candidate nodes by their comprehensive influence scores and selecting the top N. This paper uses public datasets to verify the effectiveness of the proposed algorithm. Our approach improves the F-score by 5.5% (at 4 keywords) over the classic algorithm. The experimental results demonstrate that our approach better differentiates the degree of association between nodes by mining combined long-text features. The results also show that the proposed algorithm is more promising and that its extraction is more robust than previously studied ensemble methods.
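The ranking step described above can be sketched as a PageRank iteration over a weighted word graph. The sketch assumes that the merged feature weights are already available as a matrix `weights`, where `weights[i][j]` is the edge weight from word i to word j; how the paper combines dependency, position, and BERT features into that matrix is not shown. It uses the normalized PageRank form with a `(1 - d) / n` teleport term, which differs slightly from the constant `(1 - d)` in the original TextRank formula. The damping factor 0.85 is the conventional default, not a value from the paper.

```python
def rank_keywords(weights, damping=0.85, iters=100, tol=1e-8):
    """Iterate PageRank on a weighted graph and return one score per node."""
    n = len(weights)
    out_sum = [sum(row) for row in weights]
    scores = [1.0 / n] * n
    for _ in range(iters):
        new_scores = []
        for j in range(n):
            # Each neighbor i passes on its score in proportion to the
            # normalized weight of the edge i -> j (the transition probability).
            incoming = sum(
                scores[i] * weights[i][j] / out_sum[i]
                for i in range(n)
                if out_sum[i] > 0
            )
            new_scores.append((1.0 - damping) / n + damping * incoming)
        delta = sum(abs(a - b) for a, b in zip(new_scores, scores))
        scores = new_scores
        if delta < tol:
            break
    return scores


def top_n(words, weights, n):
    """Return the n words with the highest influence scores."""
    scores = rank_keywords(weights)
    ranked = sorted(zip(words, scores), key=lambda pair: pair[1], reverse=True)
    return [word for word, _ in ranked[:n]]
```

Raising the weight of an edge raises the share of score that flows along it, which is how feature-based weighting can separate nodes that plain co-occurrence counts would treat alike.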
{"title":"Automatic keyword extraction based on dependency parsing and BERT semantic weighting","authors":"Huixin Liu","doi":"10.1117/12.2667242","DOIUrl":"https://doi.org/10.1117/12.2667242","url":null,"abstract":"It's hard for the classic TextRank algorithm to differentiate the degree of association between candidate keyword nodes. Furthermore, it readily ignores the long-distance syntactic relations and topic semantic information between words while extracting keywords from a document. For the purpose of solving this problem, we propose an improved TextRank algorithm utilizing lexical, grammatical, and semantic features to find objective keywords from Chinese academic text. Firstly, we construct the word graph of candidate keywords after text preprocessing. Secondly, we integrate multidimensional features of candidate words into the primary calculation of the transition probability matrix. In this regard, our approach mines the full text to extract a collection of grammatical and morphological features (such as part-of-speech, word position, long-distance dependencies, and distinguished BERT dynamic semantic information). By introducing the dependency syntax of long sentences, the algorithm's ability to identify low-frequency topic keywords is obviously promotional. In addition, the external semantic information is designed to be imported through the word embedding model. A merged feature-based matrix is then employed to calculate the influence of all candidate keyword nodes with the iterative formula of PageRank. Namely, we attain a set of satisfactory keywords by ranking candidate nodes according to their comprehensive influence scores and selecting the ultimate top N keywords. This paper utilizes public data sets to verify the effectiveness of the proposed algorithm. Our approach achieves comparable f-scores with a 5.5% improvement (4 keywords) over the classic. 
The experimental results demonstrate that our approach can expand the degree of association differentiation between nodes better by mining synthetic long text features. Besides, the results also show that the proposed algorithm is more promising and its extraction effect is more robust than previously studied ensemble methods.","PeriodicalId":128051,"journal":{"name":"Third International Seminar on Artificial Intelligence, Networking, and Information Technology","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-02-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125486985","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Federated Learning (FL) protects users' privacy by uploading only training results instead of gathering all the private data. However, achieving the desired model performance often requires many rounds of parameter transfer between clients and the central server. Selecting a fixed number of clients to participate in training, as is common today, slightly reduces the communication overhead but ignores the impact on model accuracy. In this paper, we propose a scheme that adaptively chooses the number of clients K and gives a better tradeoff between accuracy and cost. First, we find experimentally that increasing the number of sampled clients K reduces the number of iterations T, but once K reaches a certain value (K1), T no longer decreases significantly. Similarly, increasing K further improves model accuracy, but once K is large enough (K2 ≥ K1), the accuracy no longer improves noticeably. Thus, [K1, K2] is the optimal range. Second, we conduct experiments on different datasets with different numbers of clients and find that the optimal growth rate q' of the client number is 0.02 across conditions. Based on these results, we set the initial K to K1 to obtain the optimal T. When the magnitude of the model update between two adjacent iterations falls below a threshold, the number of participating clients increases by q' to speed up convergence, until K reaches K2; otherwise K remains unchanged. Finally, we use our algorithm to improve existing FL algorithms. Experiments demonstrate that our algorithm is suitable for existing differentially private FL algorithms.
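The update rule for K described above can be sketched as a single function. The abstract does not state what the growth rate q' is relative to; this sketch assumes it is a fraction of the total client population, so each increase adds `q * total_clients` clients. The function name, its parameters, and the minimum step of one client are likewise illustrative assumptions.

```python
def next_client_count(k, update_magnitude, threshold, k2, total_clients, q=0.02):
    """Return the number of clients to sample in the next round.

    K grows by a step of q * total_clients (assumed interpretation of q')
    when the model update between two adjacent rounds is below the threshold,
    and is capped at k2. Otherwise K is left unchanged.
    """
    if update_magnitude < threshold and k < k2:
        step = max(1, round(q * total_clients))
        return min(k2, k + step)
    return k
```

Training would start from K = K1 and call this function once per round with the norm of the latest model update.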
{"title":"An efficient differential privacy federated learning scheme with optimal adaptive client number K","authors":"Jian Wang, Mengwei Zhang","doi":"10.1117/12.2667280","DOIUrl":"https://doi.org/10.1117/12.2667280","url":null,"abstract":"Federated Learning (FL) protects users’ privacy by only uploading the training result instead of gathering all the private data. However, achieving the desired model performance often requires a large number of iterations of parameter transfer between client and central server. Currently, selecting a fixed number of clients to participate in training can slightly reduce the communication overhead during model training, but ignore the impact on model training accuracy. In this paper, we propose an adaptive chosen client number K scheme, which can give a better tradeoff between accuracy and cost. Firstly, through experiments, we find that increasing extracted clients’ number K can reduce iterations’ number T, but after K increases to a certain extent (𝐾1), T will no longer reduce significantly. Similarly, increasing K can further improve the accuracy of model training, but K is large enough (𝐾2 ≥ 𝐾1), the accuracy will also no more be improved remarkably. Thus, [𝐾1,𝐾2] is the optimal range. Secondly, we conduct experiments on different datasets with different number of clients, and find that the optimal client’s number growth rate q’ for different conditions is 0.02. According to the experimental results, we set the initial K to be 𝐾1 for the optimal T, when the model update magnitude in two adjacent iterations is less than a threshold, the number of clients participating in training will increase by q’ to speed up the convergence until K reaches K2, otherwise it will remain unchanged. Finally, we use our algorithm to improve present FL algorithms. 
Through experiments, we demonstrate that our algorithm is suitable for existing differential private FL algorithms.","PeriodicalId":128051,"journal":{"name":"Third International Seminar on Artificial Intelligence, Networking, and Information Technology","volume":"122 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-02-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117313955","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}