Mohamed Gamal M. Kamaleldin, S. Abu-Bakar, U. U. Sheikh
Title: Stochastic recursive gradient descent optimization-based on foreground features of Fisher vector
DOI: 10.1117/12.2644640 (https://doi.org/10.1117/12.2644640)
Published in: International Conference on Digital Image Processing, 2022-10-12
Citations: 0
Abstract
Human action recognition has been one of the hot topics in computer vision, in both handcrafted and deep learning approaches. In the handcrafted approach, the extracted features are encoded to reduce their size. Among the state-of-the-art approaches is encoding these visual features using a Gaussian mixture model. However, the size of the codebook is an issue in terms of computational complexity, especially for large-scale data, which requires encoding with a large codebook. In this paper, we introduce the use of different optimizers to reduce the codebook size while boosting its accuracy. To illustrate the performance, we first use improved dense trajectories (IDT) to extract the handcrafted features. The descriptors are then encoded with a Fisher kernel-based codebook built on the Gaussian mixture model. Next, a support vector machine is used to classify the categories. We then use and compare five different stochastic gradient descent optimization techniques to modify the number of Gaussian components. In this manner, we are able to select the discriminative foreground features (as represented by the final number of Gaussian components) and omit the background features. Finally, to show the performance improvement of the proposed method, we apply this technique to two datasets, UCF101 and HMDB51.
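To make the codebook-size concern concrete, the following is a minimal sketch of Fisher vector encoding over a diagonal-covariance Gaussian mixture model, using the commonly cited gradient formulas with respect to the component means and standard deviations. It is not the authors' implementation: the function names and the hand-set toy GMM parameters are assumptions made for illustration, and IDT extraction, SVM classification, and the optimizer-based selection of Gaussian components are all left out. The point it illustrates is that the encoded vector has length 2·K·D for K components and D-dimensional descriptors, so reducing K shrinks the encoding directly.

```python
import math

def gaussian_diag_logpdf(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at point x."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )

def posteriors(x, weights, means, variances):
    """Soft assignment of descriptor x to each of the K components."""
    logs = [math.log(w) + gaussian_diag_logpdf(x, m, v)
            for w, m, v in zip(weights, means, variances)]
    top = max(logs)  # subtract the maximum for numerical stability
    exps = [math.exp(l - top) for l in logs]
    total = sum(exps)
    return [e / total for e in exps]

def fisher_vector(descriptors, weights, means, variances):
    """Fisher vector from mean and std-dev gradients; length is 2 * K * D."""
    n = len(descriptors)
    k = len(weights)
    d = len(means[0])
    g_mu = [[0.0] * d for _ in range(k)]
    g_sigma = [[0.0] * d for _ in range(k)]
    for x in descriptors:
        gamma = posteriors(x, weights, means, variances)
        for j in range(k):
            for i in range(d):
                # Descriptor whitened by component j's mean and std-dev.
                z = (x[i] - means[j][i]) / math.sqrt(variances[j][i])
                g_mu[j][i] += gamma[j] * z
                g_sigma[j][i] += gamma[j] * (z * z - 1.0)
    fv = []
    for j in range(k):
        fv.extend(g / (n * math.sqrt(weights[j])) for g in g_mu[j])
    for j in range(k):
        fv.extend(g / (n * math.sqrt(2.0 * weights[j])) for g in g_sigma[j])
    return fv

# Toy example (assumed values): K = 2 components, D = 2 dimensions.
toy_descriptors = [[0.0, 0.1], [1.0, 0.9], [0.1, -0.1]]
toy_weights = [0.5, 0.5]
toy_means = [[0.0, 0.0], [1.0, 1.0]]
toy_variances = [[1.0, 1.0], [1.0, 1.0]]

fv = fisher_vector(toy_descriptors, toy_weights, toy_means, toy_variances)
print(len(fv))  # 2 * K * D = 8
```

In practice the GMM parameters would be fitted to training descriptors instead of set by hand, and the resulting vectors are often power- and L2-normalized before being passed to a classifier; both steps are omitted here to keep the sketch short.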