Video Classification Method of Multi-Way Convolutional Network Based on Deep Metric Learning
Xiaoxia Luo, Bei B. Zhou
DEStech Transactions on Engineering and Technology Research, published 2020-10-06
DOI: 10.12783/dtetr/mcaee2020/35028 (https://doi.org/10.12783/dtetr/mcaee2020/35028)
Semantic changes within videos significantly affect video classification results, because they produce large intra-class dispersion and high inter-class similarity. To address this, this paper proposes a multi-way convolutional network video classification method based on deep metric learning. The method comprises a multi-way convolutional network built on a 3D network and a metric learning method based on the allocation of negative sample intervals. The network is divided into three main parts: segmented video feature extraction, similarity measurement based on deep metric learning, and classification. First, the multi-way convolutional network extracts features from different time periods of the video and obtains deep features of the video through feature fusion. Second, by computing an error from an interval function of the average semantic distance of negative samples and backpropagating it, the network learns the differences in semantic distance between samples. Finally, the network combines the classification task with metric learning during training to improve the classification results. Experiments on the UCF101 dataset show that, compared with existing methods, the proposed method effectively improves video classification accuracy.
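The three stages described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact fusion rule or interval function, so everything below is an assumption made for illustration only: fusion is taken to be a plain average of per-segment feature vectors, the interval (margin) is taken to be a fixed fraction `scale` of the mean anchor-to-negative distance, and the two losses are combined with a hypothetical `weight`. All function and parameter names are hypothetical and are not taken from the paper.

```python
import math


def fuse_segments(segment_features):
    """Fuse per-segment feature vectors into one video-level vector.

    Assumption: fusion is an element-wise average; the paper may use
    a different fusion rule.
    """
    n = len(segment_features)
    return [sum(vals) / n for vals in zip(*segment_features)]


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def metric_loss(anchor, positive, negatives, scale=0.5):
    """Hinge-style metric loss with a margin derived from negatives.

    Assumption: the margin is `scale` times the mean anchor-negative
    distance. This is only one possible reading of "interval function
    of the average semantic distance of negative samples".
    """
    d_pos = euclidean(anchor, positive)
    d_negs = [euclidean(anchor, neg) for neg in negatives]
    margin = scale * (sum(d_negs) / len(d_negs))
    # Penalise negatives that are not at least `margin` farther than the positive.
    return sum(max(0.0, d_pos - d + margin) for d in d_negs) / len(d_negs)


def cross_entropy(logits, label):
    """Numerically stable softmax cross-entropy for a single sample."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]


def joint_loss(logits, label, anchor, positive, negatives, weight=1.0):
    """Classification loss plus weighted metric loss (hypothetical weighting)."""
    return cross_entropy(logits, label) + weight * metric_loss(anchor, positive, negatives)
```

In a real system the feature vectors would come from 3D convolutional branches applied to each video segment, and the joint loss would be minimised by backpropagation; the sketch only shows how the pieces could fit together.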