Pub Date: 2022-12-09 DOI: 10.1109/ACAIT56212.2022.10137884
Keying Huang, Rui Bai, Jin Ji, Jun Zhao, Wen-ning Yan
The aero-engine is the power system of an aircraft, so accurately predicting its remaining useful life (RUL) is essential to flight safety. However, existing methods are all data-driven and demand large volumes of data. To address the problem of insufficient engine data, this paper proposes a similarity-based method for predicting the life of aircraft engines from small samples. First, kernel principal component analysis (KPCA) is used to model the engine degradation trajectory. Next, a simple and effective method determines the degradation start time of each engine. Finally, the similarity between each training sample and the test sample is computed from the trained KPCA model, and the remaining life of the test sample is estimated. Experiments show that the proposed method predicts the remaining life of an engine effectively under small-sample conditions.
Title: A Similarity-Based Remaining Useful Life Prediction Method for Aero Engines with Small Samples. In: 2022 6th Asian Conference on Artificial Intelligence Technology (ACAIT).
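The similarity step described in the abstract above can be illustrated with a minimal sketch. It assumes each engine has already been reduced to a one-dimensional health-indicator sequence (for example, by a KPCA projection, which is not shown here). The function names, the mean-squared distance, the exponential weighting, and the `bandwidth` parameter are all hypothetical choices for illustration, not the authors' method.

```python
import math

def trajectory_distance(test_hi, train_hi, offset):
    """Mean squared distance between the test sequence and the slice of a
    training trajectory that starts at `offset` (hypothetical measure)."""
    segment = train_hi[offset:offset + len(test_hi)]
    return sum((a - b) ** 2 for a, b in zip(test_hi, segment)) / len(test_hi)

def estimate_rul(test_hi, training_trajectories, bandwidth=0.05):
    """Similarity-weighted RUL estimate.

    Each training trajectory is assumed to be a run-to-failure
    health-indicator sequence, so its last sample marks end of life.
    """
    weights, ruls = [], []
    for train_hi in training_trajectories:
        max_offset = len(train_hi) - len(test_hi)
        if max_offset < 0:
            continue  # training run is shorter than the test history
        # Slide the test history along the training run and keep the best match.
        best = min(range(max_offset + 1),
                   key=lambda k: trajectory_distance(test_hi, train_hi, k))
        dist = trajectory_distance(test_hi, train_hi, best)
        weights.append(math.exp(-dist / bandwidth))
        # Cycles left in the training run after the matched window.
        ruls.append(len(train_hi) - (best + len(test_hi)))
    if not weights:
        raise ValueError("no training trajectory is long enough")
    return sum(w * r for w, r in zip(weights, ruls)) / sum(weights)

# Toy data: one training engine degrading linearly from 1.0 to 0.0.
training = [[1.0 - 0.1 * i for i in range(11)]]
print(estimate_rul([0.8, 0.7, 0.6], training))
```

With several training engines, the estimate becomes a weighted average in which better-matching trajectories contribute more.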
Pub Date: 2022-12-09 DOI: 10.1109/ACAIT56212.2022.10137846
Yukun Huang
To improve intrusion detection in mixed-topology networks built from multi-dimensional node combinations, this paper proposes an intrusion detection method based on the naive Bayes algorithm. A distributed structure model of intrusion data in the network is built, and traffic statistics and feature analysis are performed through low-speed monitoring and combined frequency scanning to extract abnormal-traffic label features. Then the fuzzy clustering centers of the intrusion data are detected according to attack type, and a fusion model of the anomaly feature distribution of the intrusion traffic sequence is established from the clustering results. On this basis, the redundancy and correlation of intrusion information are detected, the fuzzy weights of the intrusion traffic sequence are analyzed, and adaptive learning is completed. Finally, the attack data are controlled, achieving extraction and detection of intrusion information features. Test results show that the method detects intrusion data with high accuracy and has strong anti-interference ability, so it can be used to improve network security and resistance to attacks.
Title: Network Intrusion Detection Method Based on Naive Bayes Algorithm. In: 2022 6th Asian Conference on Artificial Intelligence Technology (ACAIT).
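As a rough illustration of the naive Bayes classification step named in the abstract above, the following sketch fits a Gaussian naive Bayes model to labeled traffic features. The abstract does not specify the features or the likelihood model, so the two features used here (packets per second and a SYN-packet fraction), the Gaussian assumption, and all function names are hypothetical.

```python
import math
from collections import defaultdict

def fit_gaussian_nb(samples, labels, var_floor=1e-6):
    """Estimate a log prior plus per-feature mean and variance for each class."""
    grouped = defaultdict(list)
    for x, y in zip(samples, labels):
        grouped[y].append(x)
    model = {}
    for label, rows in grouped.items():
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        # var_floor keeps the variance strictly positive for constant features.
        variances = [sum((v - m) ** 2 for v in col) / n + var_floor
                     for col, m in zip(columns, means)]
        model[label] = (math.log(n / len(samples)), means, variances)
    return model

def predict(model, x):
    """Return the class with the highest log posterior for feature vector x."""
    def log_posterior(label):
        log_prior, means, variances = model[label]
        log_likelihood = sum(
            -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
            for v, m, var in zip(x, means, variances))
        return log_prior + log_likelihood
    return max(model, key=log_posterior)

# Hypothetical features: [packets per second, fraction of SYN packets].
traffic = [[10, 0.10], [12, 0.20], [11, 0.15],
           [200, 0.90], [220, 0.80], [210, 0.95]]
labels = ["normal", "normal", "normal", "attack", "attack", "attack"]
model = fit_gaussian_nb(traffic, labels)
print(predict(model, [205, 0.85]))
```

The "naive" part is the sum over features inside `log_posterior`: each feature contributes independently given the class.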
Pub Date: 2022-12-09 DOI: 10.1109/ACAIT56212.2022.10137817
Ruxi Xiang, Qingquan Xu, Xifang Zhu, Longan Zhang, Feng Wu
Dehazing aims to remove the interference of haze from an image and obtain a high-quality result, using approaches such as statistical methods, image restoration, and deep learning. Classical methods have been proposed for removing haze and perform well, but their results exhibit some aliasing. To address this issue, we propose image dehazing using a spatial and channel aware network (IDSCAN), which learns features with strong representation ability from haze-free images. For spatial awareness, we combine convolutional information with simple operations such as unfold and reshape. For channel awareness, we compute the weight of each channel by compression in the frequency domain, implemented by a discrete cosine transform block (DCTB) network. Extensive experiments on the RESIDE haze dataset show that our method outperforms other state-of-the-art dehazing methods both qualitatively and quantitatively, and that it also reduces aliasing in the dehazed images.
Title: IDSCAN: Image Dehazing Using Spatial and Channel Aware Network. In: 2022 6th Asian Conference on Artificial Intelligence Technology (ACAIT).
Pub Date: 2022-12-09 DOI: 10.1109/ACAIT56212.2022.10137838
Jianhong Zou, Yihui Cui, Ting Zhao, Weihua Ouyang, Bei Luo, Qilie Liu
In autonomous driving systems, accurate scene perception and trajectory prediction are critical for collision avoidance and path planning. This paper proposes a scene perception and trajectory prediction method based on a graph attention mechanism that learns semantic and interaction information from a bird's-eye view (BEV) map. The method comprises a spatiotemporal pyramid network and a graph attention network. The spatiotemporal pyramid network models the surrounding information to obtain scene features, and the graph attention network models the interactions among surrounding traffic participants to obtain graph interaction features. The scene semantic features and graph interaction features are then fused into a unified feature space for downstream pixel-level classification and trajectory prediction. Compared with the baseline method, the proposed method significantly improves average classification accuracy and reduces the average trajectory prediction error while remaining efficient. Experimental results show that the proposed method performs better and is more feasible to deploy in real-world autonomous driving scenarios.
Title: Spatiotemporal Pyramid Aggregation and Graph Attention for Scene Perception and Trajectory Prediction. In: 2022 6th Asian Conference on Artificial Intelligence Technology (ACAIT).
Pub Date: 2022-12-09 DOI: 10.1109/ACAIT56212.2022.10137978
Yaxian Liu, Hao Fang, Hua Yu
Traditional face classification algorithms have low accuracy for gender classification. Drawing on the deep feature extraction capability of convolutional neural networks, a face classification model based on an Inception-ResNet network and an estimated logistic regression model is constructed by stacked generalization. In this model, the Inception-ResNet network is the level-0 learner and a binomial logistic regression model is the level-1 learner, which together perform deep learning and intelligent classification of face images. Experimental results show that the gender classification accuracy of the proposed Inception-ResNet network reaches 97.45 ± 0.78, higher than that of single VGG16 and ResNet50 network models. Compared with two other face classification algorithms, the proposed algorithm improves classification accuracy by 5.52% and 4.69%, respectively. The proposed algorithm can therefore classify gender accurately from face images, which can further accelerate the application of intelligent technology.
Title: Research on Intelligent Classification Algorithm of Human Faces Based on Deep Learning. In: 2022 6th Asian Conference on Artificial Intelligence Technology (ACAIT).
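The stacking arrangement in the abstract above (a level-0 learner whose outputs feed a level-1 logistic regression) can be sketched as follows. The level-0 network is not reproduced; its outputs are replaced by a small table of hypothetical scores, and the gradient-descent settings (`lr`, `epochs`) and function names are assumptions for illustration only.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(features, targets, lr=0.5, epochs=2000):
    """Level-1 learner: binomial logistic regression trained by batch
    gradient descent on the level-0 outputs."""
    n = len(features)
    weights = [0.0] * len(features[0])
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(features, targets):
            err = sigmoid(sum(w * v for w, v in zip(weights, x)) + bias) - y
            for j, v in enumerate(x):
                grad_w[j] += err * v
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias

def predict_proba(model, x):
    weights, bias = model
    return sigmoid(sum(w * v for w, v in zip(weights, x)) + bias)

# Hypothetical level-0 scores (e.g. held-out class probabilities from a CNN);
# each row is one image, and the label is the binary class.
level0_scores = [[0.90, 0.80], [0.80, 0.70], [0.85, 0.90],
                 [0.20, 0.10], [0.10, 0.30], [0.15, 0.20]]
labels = [1, 1, 1, 0, 0, 0]
meta_model = fit_logistic(level0_scores, labels)
print(predict_proba(meta_model, [0.9, 0.9]) > 0.5)
```

In practice the level-0 scores used to train the level-1 learner are usually produced on held-out folds, so the meta-model does not simply learn the level-0 learner's training-set overconfidence.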
Pub Date: 2022-12-09 DOI: 10.1109/ACAIT56212.2022.10137969
Guilu Wang
This paper addresses fatigue driving detection based on the YOLOV5 object detection algorithm. YOLOV5N, which has fewer parameters, is selected as the base model, and its large-object detection layer is removed according to object size clustering results, which reduces the parameter count and improves detection results. SAM is introduced to improve the backbone network's ability to extract key features, and the convolution kernel in SAM is enlarged to give the model a wider receptive field, yielding better detection results for a small increase in parameters. Following BiFPN, the Neck of YOLOV5N is modified to provide more diverse fusion of multi-scale features. The precision, recall, and mAP of the improved model are higher than those of YOLOV5N.
Title: Fatigue Driving Detection Based on Improved YOLOV5. In: 2022 6th Asian Conference on Artificial Intelligence Technology (ACAIT).
Pub Date: 2022-12-09 DOI: 10.1109/ACAIT56212.2022.10137941
Baofu Fang, Shuai Zhou, Hao Wang
Most existing SLAM algorithms assume a static environment, and this strong assumption limits the practical application of most SLAM systems. The main reason is that moving objects cause feature mismatches during pose estimation, which in turn degrades the accuracy of localization and mapping. In this paper, we propose a SLAM algorithm for dynamic environments. First, we use the BlendMask network to detect potential moving objects and generate masks for dynamic objects, and a geometrically constrained joint optical flow method is used to detect dynamic feature points. Second, to handle failures of the semantic segmentation network, a missed-detection compensation algorithm based on the invariance of speed between adjacent frames is proposed. Finally, a keyframe selection strategy is proposed to construct a semantic octree map containing only static objects. We evaluate our algorithm on the TUM RGB-D dataset and on real-scene datasets. The experimental results show that the algorithm achieves high accuracy and real-time performance.
Title: Semantic SLAM Based on Compensated Segmentation and Geometric Constraints in Dynamic Environments. In: 2022 6th Asian Conference on Artificial Intelligence Technology (ACAIT).
Pub Date: 2022-12-09 DOI: 10.1109/ACAIT56212.2022.10137972
Xing Zhou, Yaping Wan
Causal relations are a cornerstone of human understanding and exploration of the world, and inferring causal relations between things has long been of interest to researchers. Most traditional methods are designed purely for discrete or purely for continuous data, yet mixed data are widely available. This paper proposes a causal discovery method based on a hybrid structural equation model. The main idea is to formulate a nonlinear causal mechanism for mixed data through a hybrid structural equation model, while incorporating the ideas of structural equations and probabilistic noise in likelihood maximization, which enables efficient causal inference on mixed data. Experimental results on synthetic and real-world datasets show that the method improves the accuracy of causal inference for mixed data and is robust to anomalous data.
Title: Causal Discovery Based on Hybrid Structural Equation Model. In: 2022 6th Asian Conference on Artificial Intelligence Technology (ACAIT).
Pub Date: 2022-12-09 DOI: 10.1109/ACAIT56212.2022.10137933
Shahela Saif, Samabia Tehseen
Face analysis is one of the key research areas in computer vision, with applications in numerous areas. Face recognition, emotion recognition, and more recently deepfake detection have greatly benefited from advances in face analysis. Our research attempts to identify useful facial features for analysis. We first analyze the effectiveness of geometric facial features for emotion recognition. In later experiments, a fusion scheme was created based on the preliminary analysis to test the performance of the selected features in distinguishing real from fake images. We include local image features in combination with geometric facial features to measure their effectiveness in fake image detection tasks. The promising results produced in this study can be used to perform a more in-depth analysis of face geometry and its result in facial analysis.
Title: Evaluating Effectiveness of Using Multi-Features to Differentiate Real from Fake Facial Images. In: 2022 6th Asian Conference on Artificial Intelligence Technology (ACAIT).
Pub Date: 2022-12-09 DOI: 10.1109/ACAIT56212.2022.10137999
Ruoyu Lou, Wu Yang, Yingjiang Li, Ling Lu
Deep-learning target tracking methods have many model parameters and insufficient real-time performance, which makes them difficult to apply on mobile terminals or embedded devices with limited computing power. To address this, a lightweight Siamese network tracking algorithm based on hybrid attention is proposed. First, group convolution and channel rearrangement are applied to the MobileNetv3-Large network. Then, because traditional attention mechanisms consider only a single scope, this paper proposes a lightweight group-gated mixed attention (GG) module. Finally, GG is embedded in the Siamese network structure and a hierarchical feature fusion strategy is used to improve tracking accuracy. Experiments show that the proposed GG has 26.2% fewer parameters than CBAM and 6.50% fewer than SE, while increasing Top-1 accuracy by 2.59% and 2.68%, respectively; experiments on the OTB100 and VOT2018 datasets demonstrate that, compared with traditional tracking algorithms, the proposed algorithm has clear advantages in accuracy and real-time performance.
Title: Object Tracking Method Combined with Lightweight Hybrid Attention Siamese Network. In: 2022 6th Asian Conference on Artificial Intelligence Technology (ACAIT).
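The "channel rearrangement" that follows group convolution in the abstract above is sketched here on the assumption that it refers to the ShuffleNet-style channel shuffle; the abstract does not confirm this. The function name and the use of a plain list of channel identifiers (instead of a tensor) are simplifications for illustration.

```python
def channel_shuffle(channels, groups):
    """Interleave channels across groups so that the next group convolution
    sees channels from every previous group.

    Equivalent to reshaping the channel axis to (groups, per_group),
    transposing to (per_group, groups), and flattening.
    """
    if len(channels) % groups != 0:
        raise ValueError("channel count must be divisible by groups")
    per_group = len(channels) // groups
    return [channels[g * per_group + i]
            for i in range(per_group)
            for g in range(groups)]

# Six channels in two groups: [0, 1, 2] and [3, 4, 5].
print(channel_shuffle([0, 1, 2, 3, 4, 5], groups=2))  # [0, 3, 1, 4, 2, 5]
```

Without such a shuffle, stacked group convolutions would keep each group's information isolated from the others.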