Pub Date: 2021-08-20 DOI: 10.1109/CSAIEE54046.2021.9543367
Chao Deng, Zhaohui Ma
A decision tree is a machine learning method that decides an outcome according to the probabilities of different feature values, and an effectively constructed decision tree can support data analysis. Generating a decision tree is a recursive process: the optimal partition attribute is chosen as the tree node, and the values of that attribute are used to construct branches. This continues until the data reaches a certain purity, at which point leaf nodes are obtained and a decision tree consistent with the rules is constructed. Among the traditional decision tree algorithms, C4.5 uses the gain ratio as its attribute partition criterion. This leads to an obvious disadvantage: it prefers attributes with a small number of values, so the accuracy of the resulting decision tree is often not ideal. In view of this, this paper proposes an improved E-C4.5 algorithm, which combines information gain and information gain ratio into a new attribute partition criterion. This criterion largely eliminates C4.5's preference for attributes with a small number of values and further improves the accuracy of the generated decision tree. Real data sets are used to compare the accuracy of the decision tree generated by the improved algorithm with that of the traditional C4.5 algorithm.
Title: Research on C4.5 Algorithm Optimization For User Churn
Published in: 2021 IEEE International Conference on Computer Science, Artificial Intelligence and Electronic Engineering (CSAIEE)
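The two quantities this abstract builds on, information gain and gain ratio, can be sketched as follows. This is a minimal illustration of the standard definitions only; the combined E-C4.5 criterion is not specified in the abstract and is not reproduced here. The function names and the toy data are assumptions made for the example.

```python
import math
from collections import Counter


def entropy(items):
    """Shannon entropy (in bits) of a list of discrete values."""
    total = len(items)
    return -sum((n / total) * math.log2(n / total) for n in Counter(items).values())


def information_gain(values, labels):
    """Entropy reduction obtained by splitting `labels` on the attribute `values`."""
    total = len(labels)
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def gain_ratio(values, labels):
    """Standard C4.5 criterion: information gain divided by split information."""
    split_info = entropy(values)  # entropy of the attribute's own value distribution
    if split_info == 0:
        return 0.0  # the attribute has a single value and cannot split the data
    return information_gain(values, labels) / split_info


# Toy data (hypothetical): both attributes separate the classes perfectly,
# but attr_b does so with four distinct values instead of two.
labels = ["yes", "yes", "no", "no"]
attr_a = ["x", "x", "y", "y"]
attr_b = ["p", "q", "r", "s"]
print(information_gain(attr_a, labels), gain_ratio(attr_a, labels))
print(information_gain(attr_b, labels), gain_ratio(attr_b, labels))
```

Both attributes have the same information gain, but the gain ratio of the four-valued attribute is halved by its larger split information, which is the preference for attributes with few values that the abstract describes.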
Pub Date: 2021-08-20 DOI: 10.1109/CSAIEE54046.2021.9543277
Yiying Zhang
With the development of the internet, more and more people tend to shop online, which is convenient and time-saving. Taobao is the largest online shopping site in China. In this paper, we use data from Taobao to predict the click-through rate (CTR), an important metric that measures the number of clicks advertisers receive on their ads per number of impressions. We introduce the xDeepFM model to predict CTR and use Bayesian optimization to tune it. xDeepFM is a combined model, composed of a DNN and a CIN, in a manner similar to Wide&Deep. As a result, xDeepFM can capture two kinds of feature interactions: implicit high-order interactions and explicit high-order interactions. In addition, we use Bayesian optimization to obtain the optimal hyperparameters. The metric we use is AUC-ROC; the higher the AUC-ROC, the better the model performs. The AUC-ROC of our model is 0.651, which is 0.031 and 0.012 higher than that of LightGBM and DeepFM, respectively.
Title: CTR Prediction Model Using xDeepFM and Bayesian optimization
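For readers unfamiliar with the evaluation metric, the sketch below computes AUC-ROC from its pairwise definition: the probability that a randomly chosen positive example (a click) receives a higher score than a randomly chosen negative one. It is an illustrative sketch, not the paper's evaluation code; the function name and the toy scores are assumptions.

```python
def auc_roc(labels, scores):
    """AUC-ROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical predictions: 1 = clicked, 0 = not clicked.
labels = [1, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.1]
print(auc_roc(labels, scores))  # 3 of the 4 positive/negative pairs are ordered correctly
```

The quadratic loop is only suitable for small examples; on large data sets a rank-based computation gives the same value more efficiently.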
Pub Date: 2021-08-20 DOI: 10.1109/CSAIEE54046.2021.9543398
Yue Luo, Boyuan Yang
More and more videos have appeared on the internet in recent years, so new ways are needed to recognize and manage them. Since a video is composed of images, this work builds a CNN to perform video classification. The work uses the UCF 101 dataset, which contains 101 different categories, to train the model. First, a simple five-layer CNN is built with PyTorch and trained on the UCF 101 dataset on a GPU. The results show that this network underfits and that changing its parameters does not improve its accuracy much. However, adding more layers, including dropout and batch normalization layers, greatly improves its accuracy. A C3D method is then applied to improve the accuracy further. Finally, the highest accuracy reaches 69 percent. This work develops a simple and effective way to recognize actions in small videos, to help people supervise and manage online video resources.
Title: Video motions classification based on CNN
The image dehazing task involves three key subtasks. The first is extracting the finer-scale features covered by haze, e.g. the detailed textures of objects. The second is retaining the coarser-scale features, e.g. the contours of objects, as completely as possible. The third is fusing the finer-scale and coarser-scale features. To address these three points, we propose a single-image dehazing network named Res-Attention Net, based on an encoder-decoder structure similar to U-Net. The encoder and decoder of Res-Attention Net are designed to extract the detailed textures and retrieve the contours at the same time. We construct the encoder from residual blocks (RBs) of different depths together with downsampling to perform the first two subtasks, i.e. extracting multiscale image features from the original hazy image. The decoder is based on attention gates (AGs) and upsampling. It retrieves the coarser-scale features from the output of the encoder and fuses them with the multiscale features from the encoder; that is, the decoder performs the last two subtasks. The experimental results show that the proposed Res-Attention Net performs better than several state-of-the-art methods.
Title: Res-Attention Net: An Image Dehazing Network
Authors: Shuai Song, Ren-Yuan Zhang, Zhipeng Qiu, Jiawei Jin, Shangbin Yu
Pub Date: 2021-08-20 DOI: 10.1109/CSAIEE54046.2021.9543298
Pub Date: 2021-08-20 DOI: 10.1109/CSAIEE54046.2021.9543185
Shitian He, H. Zou, Runlin Li, Xu Cao, Fei Cheng, Juan Wei
Most CNN-based detectors rely on high-quality images, and detection performance may degrade as image quality decreases. To enhance ship detection in low-quality images, we propose a distillation network that uses a “teacher” detector with high-quality input images to guide the training of the original “student” detector. In this way, the student detector learns useful information from the teacher and thus achieves improved detection performance. We feed images with different sampling ratios to our network, and the experimental results on the HRSC2016 dataset validate the effectiveness of our method. Moreover, we apply our method to different backbones, and the experimental results demonstrate its generality.
Title: Teacher-Student Network for Low-quality Remote Sensing Ship Detection
Pub Date: 2021-08-20 DOI: 10.1109/CSAIEE54046.2021.9543424
Yunfei Quan
Improved technology has resulted in numerous computing devices. The more time people spend interacting with these devices, the more interest there is in new interaction methods that could facilitate the application of active learning in decoding with computing devices. Applying active learning in decoding technology will help achieve the goal of this journal, as it helps to overcome the limitations of the eye in speed and efficiency. The Application of Active Learning Decoding for the Python 3 programming language was created mainly because it is flexible, easy to extend, and can be used to develop other small models. Machine learning algorithms in Python are often built on the scikit-learn library. The user can easily modify the model and build a new one to predict with the learning algorithm. The model can also be used to design a good-quality novel algorithm that is easier to use.
Title: Application of Active Learning in Decoding
Pub Date: 2021-08-20 DOI: 10.1109/CSAIEE54046.2021.9543260
BoKai Wu
K-means is a commonly used unsupervised learning algorithm in machine learning, regularly used for data clustering. Only the number of clusters needs to be specified for it to automatically group the data into multiple categories, such that the similarity between data in the same cluster is high and the similarity between data in different clusters is low. K-means is a typical distance-based clustering algorithm: it takes distance as the measure of similarity, that is, the closer two objects are, the more similar they are. Clustering is widely used in practical applications such as market segmentation, social network analysis, organizing computing clusters, and astronomical data analysis. This paper is my own attempt to write K-means code and an API, using Python and Java together in one project. Python is mainly used to write the framework of the core K-means algorithm, and Java is used to create the experimental data. In this research report, I describe the simple data model provided by K-means, as well as the design and implementation of K-means.
Title: K-means clustering algorithm and Python implementation
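The distance-based procedure described in this abstract can be sketched with the standard Lloyd iteration: assign each point to its nearest centroid, then move each centroid to the mean of its points. This is a minimal sketch of the textbook algorithm, not the author's implementation; the function name, the `seed` parameter, and the sample points are assumptions.

```python
import random


def kmeans(points, k, iterations=100, seed=0):
    """Cluster `points` (a list of equal-length tuples) into `k` groups."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct data points
    assignment = None
    for _ in range(iterations):
        # Assignment step: index of the nearest centroid by squared Euclidean distance.
        new_assignment = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        if new_assignment == assignment:
            break  # no point changed cluster, so the centroids are stable
        assignment = new_assignment
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return centroids, assignment


# Two well-separated groups of hypothetical 2-D points.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centroids, assignment = kmeans(points, k=2)
print(centroids)
print(assignment)
```

Only the number of clusters `k` has to be supplied, as the abstract notes; the result can still depend on the initial centroids, which is why the sketch exposes a seed.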
Pub Date: 2021-08-20 DOI: 10.1109/CSAIEE54046.2021.9543180
Runze Guo, Shaojing Su, Zhen Zuo, Bei Sun
With the growing demand for marine environment supervision in China, surface target recognition has attracted more attention. To address the problems of complex water scenes, namely scale changes, abundant background information, and the inability to focus on key features, this paper proposes a multi-scale surface target recognition algorithm based on an attention fusion mechanism. First, the network extracts different features from surface targets with a multi-scale convolutional neural network. Then, discriminative features are enhanced by fusing a channel attention module and a spatial attention module. Finally, the feature representation of surface targets is formed with a joint loss function combining localization loss and category loss. Tests are conducted on the VOC2007 dataset and a self-built surface target dataset, and the results show that the algorithm outperforms other typical recognition algorithms on surface targets.
Title: A multi-scale surface target recognition algorithm based on attention fusion mechanism
Pub Date: 2021-08-20 DOI: 10.1109/CSAIEE54046.2021.9543379
Tanqiu Jiang, Ziyu Xiong
Along with the rapid growth of wildfire events around the globe, the call for better forest management strategies has recently grown stronger. “Tree delineation”, the process of identifying each individual tree in images, is a crucial element of forest management and remote sensing. Many efforts have been made to locate individual trees in an image, but the vast majority of this research was not based on RGB images, which are the most common and the most easily available at a large scale. In our study, we used RGB satellite images from Google Earth and attempted to identify each tree in the images with a rule-based methodology. Our method includes recognizing vegetation, isolating trees, and locating local maxima. The result of our algorithm is comparable to manual tree labeling, and its robustness was confirmed by repeating the same approach on multiple images of different locations.
Title: Rule-Based Approach to the Automatic Detection of Individual Tree Crowns in RGB Satellite Images
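The last step named in the abstract, locating local maxima, can be illustrated on a small grid of intensity values. The abstract does not say which image channel, neighbourhood, or smoothing the authors use, so the 8-neighbour rule, the function name, and the grid below are all assumptions made for this sketch.

```python
def local_maxima(grid):
    """Return (row, col) of every cell strictly greater than all of its neighbours.

    Neighbours are the up to 8 surrounding cells; border cells have fewer.
    """
    rows, cols = len(grid), len(grid[0])
    peaks = []
    for r in range(rows):
        for c in range(cols):
            neighbours = [
                grid[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
                if (rr, cc) != (r, c)
            ]
            if all(grid[r][c] > n for n in neighbours):
                peaks.append((r, c))
    return peaks


# Hypothetical brightness values with two bright spots standing in for tree crowns.
grid = [
    [1, 2, 1, 0],
    [2, 5, 2, 1],
    [1, 2, 1, 3],
    [0, 1, 3, 7],
]
print(local_maxima(grid))  # [(1, 1), (3, 3)]
```

In a pipeline like the one described, such peaks would only be searched within pixels already classified as trees, so that bright non-vegetation pixels are not reported as crowns.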
Pub Date: 2021-08-20 DOI: 10.1109/CSAIEE54046.2021.9543210
Liang Li, Min Hu, Fuji Ren, Haijun Xu
Over the years, energy time series forecasting has been widely studied and has played an important role in various fields, such as electric energy forecasting and solar energy forecasting. In energy time series forecasting, it is crucial to build forecasting models for long series in order to obtain accurate results, yet the use of long series can cause the accuracy of the model to decrease. In this paper, we propose a deep learning model (TCNTA-BiGRU) based on a bidirectional gated recurrent unit (BiGRU) with a temporal attention mechanism to address the problem of accuracy degradation in long-sequence tasks. First, to capture long-term dependencies, the dataset is divided and fed into a temporal convolutional network (TCN), which transforms long sequences into multiple short sequences; this not only avoids the gradient explosion or vanishing that arises when processing long sequences, but also reduces the space complexity. Then, BiGRU is used to learn historical and future information and capture more short-term dependencies. Moreover, to enhance the model's ability to focus on data periodicity, a temporal attention mechanism is introduced. Additionally, an autoregressive module is used to increase the linear fitting ability of the model. The proposed model is applied to the Electricity and Solar Energy datasets, and the results show better performance relative to existing deep learning models.
Title: Temporal Attention Based TCN-BIGRU Model for Energy Time Series Forecasting
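The idea of a temporal attention mechanism, weighting time steps by their relevance before summarizing them, can be sketched as below. The abstract does not give the paper's attention formulation, so this uses plain dot-product scoring with a softmax as a stand-in; the function names, the `query` vector, and the toy hidden states are assumptions.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def temporal_attention(hidden_states, query):
    """Weight each time step by its dot product with `query`.

    Returns the attention weights and the weighted sum of the hidden states.
    """
    scores = [sum(h_i * q_i for h_i, q_i in zip(h, query)) for h in hidden_states]
    weights = softmax(scores)
    dim = len(hidden_states[0])
    context = [sum(w * h[d] for w, h in zip(weights, hidden_states)) for d in range(dim)]
    return weights, context


# Three hypothetical time steps of a 2-dimensional hidden state.
hidden_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [1.0, 1.0]
weights, context = temporal_attention(hidden_states, query)
print(weights)
print(context)
```

In a model such as the one described, the hidden states would come from the recurrent layer and the scoring function would be learned; here the third time step simply receives the largest weight because it aligns best with the query.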