Adaptive spatio-temporal attention neural network for cross-database micro-expression recognition
Author: Yuhan RAN
Journal: Virtual Reality Intelligent Hardware, Volume 5, Issue 2, Pages 142-156
Publication date: 2023-04-01
DOI: 10.1016/j.vrih.2022.03.006
URL: https://www.sciencedirect.com/science/article/pii/S2096579622000316
Citation count: 1
Abstract
Background
Recognizing human emotions from micro-expressions is one of the most critical challenges in human-computer interaction applications. In recent years, cross-database micro-expression recognition (CDMER) has emerged as a significant challenge in micro-expression recognition and analysis. Because the training and testing data in CDMER come from different micro-expression databases, CDMER is more challenging than conventional micro-expression recognition.
Methods
This paper presents an adaptive spatio-temporal attention neural network (ASTANN) to address this challenge. First, the micro-expression databases SMIC and CASME II are preprocessed with an optical flow approach, which extracts the motion information between video frames that represents the discriminative features of micro-expressions. Next, a novel adaptive framework with a spatio-temporal attention module assigns spatial and temporal weights to enhance the most discriminative features. The deep neural network then extracts cross-domain features: the second-order statistics of the sample features in the source domain are aligned with those in the target domain by minimizing the correlation alignment (CORAL) loss, so that the source and target databases share similar distributions.
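The alignment of second-order statistics can be illustrated with a minimal NumPy sketch of a CORAL-style loss. This is not the paper's implementation: the abstract does not give the exact formula, so the 1/(4d²) scaling is borrowed from the commonly used Deep CORAL formulation and is an assumption here. The function names `covariance` and `coral_loss` and the random feature matrices are hypothetical.

```python
import numpy as np


def covariance(features: np.ndarray) -> np.ndarray:
    """Unbiased covariance of an (n_samples, d) feature matrix."""
    n = features.shape[0]
    centered = features - features.mean(axis=0, keepdims=True)
    return centered.T @ centered / (n - 1)


def coral_loss(source: np.ndarray, target: np.ndarray) -> float:
    """Squared Frobenius distance between source and target covariances.

    The 1 / (4 d^2) factor is an assumed normalization (Deep CORAL style),
    not something stated in the abstract.
    """
    d = source.shape[1]
    diff = covariance(source) - covariance(target)
    return float(np.sum(diff ** 2) / (4.0 * d * d))


# Hypothetical stand-ins for features extracted from two databases.
rng = np.random.default_rng(0)
source = rng.normal(size=(64, 8))          # e.g. source-database features
target = 2.0 * rng.normal(size=(48, 8))    # target features with larger variance

print("identical domains:", coral_loss(source, source))
print("mismatched domains:", coral_loss(source, target))
```

The loss is zero when both domains have the same feature covariance and grows as the covariances diverge. Because the features are mean-centered first, a constant shift of one domain leaves the loss unchanged, so in this sketch only second-order statistics are aligned.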
Results
To evaluate the performance of ASTANN, experiments were conducted on the SMIC and CASME II databases under the standard experimental evaluation protocol of CDMER. The experimental results demonstrate that ASTANN outperformed other methods in relevant cross-database tasks.
Conclusions
Extensive experiments on benchmark tasks show that ASTANN outperforms other approaches, which demonstrates the superiority of our method in solving the CDMER problem.