BaiYang Xiang, BoKai Li, Huaijuan Zang, Zeliang Zhao, Shu Zhan
{"title":"基于多特征融合的微表情识别算法","authors":"BaiYang Xiang, BoKai Li, Huaijuan Zang, Zeliang Zhao, Shu Zhan","doi":"10.1117/12.3014469","DOIUrl":null,"url":null,"abstract":"Video facial micro expression recognition is difficult to extract features due to its short duration and small action amplitude. In order to better combine temporal and spatial information of video, the whole model is divided into local attention module, global attention module and temporal module. First, the local attention module intercepts the key areas and sends them to the network with channel attention after processing; Then the global attention module sends the data into the network with spatial attention after random erasure avoiding key areas; Finally, the temporal module sends the micro expression occurrence frame to the network with temporal shift module and spatial attention after processing; Finally, the classification results are obtained through three full connection layers after feature fusion. The experiment is tested based on CASMEⅡ dataset,After five-fold Cross Validation, the average accuracy rate is 76.15, the unweighted F1 value is 0.691.Compared with the mainstream algorithm, this method has improvement.","PeriodicalId":516634,"journal":{"name":"International Conference on Algorithm, Imaging Processing and Machine Vision (AIPMV 2023)","volume":"12 6","pages":"1296908 - 1296908-10"},"PeriodicalIF":0.0000,"publicationDate":"2024-01-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Microexpression recognition algorithm based on multi feature fusion\",\"authors\":\"BaiYang Xiang, BoKai Li, Huaijuan Zang, Zeliang Zhao, Shu Zhan\",\"doi\":\"10.1117/12.3014469\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Video facial micro expression recognition is difficult to extract features due to its short duration and small action amplitude. In order to better combine temporal and spatial information of video, the whole model is divided into local attention module, global attention module and temporal module. First, the local attention module intercepts the key areas and sends them to the network with channel attention after processing; Then the global attention module sends the data into the network with spatial attention after random erasure avoiding key areas; Finally, the temporal module sends the micro expression occurrence frame to the network with temporal shift module and spatial attention after processing; Finally, the classification results are obtained through three full connection layers after feature fusion. The experiment is tested based on CASMEⅡ dataset,After five-fold Cross Validation, the average accuracy rate is 76.15, the unweighted F1 value is 0.691.Compared with the mainstream algorithm, this method has improvement.\",\"PeriodicalId\":516634,\"journal\":{\"name\":\"International Conference on Algorithm, Imaging Processing and Machine Vision (AIPMV 2023)\",\"volume\":\"12 6\",\"pages\":\"1296908 - 1296908-10\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2024-01-09\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"International Conference on Algorithm, Imaging Processing and Machine Vision (AIPMV 2023)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1117/12.3014469\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"International Conference on Algorithm, Imaging Processing and Machine Vision (AIPMV 2023)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1117/12.3014469","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0
摘要
视频面部微表情识别因其持续时间短、动作幅度小而难以提取特征。为了更好地结合视频的时空信息,整个模型分为局部注意模块、全局注意模块和时间模块。首先,局部注意模块截取关键区域,经过处理后发送到通道注意网络;然后,全局注意模块随机擦除关键区域后,将数据发送到空间注意网络;最后,时序模块将微表情发生帧经过处理后发送到时移模块和空间注意网络;最后,通过三个全连接层进行特征融合后得到分类结果。实验基于 CASMEⅡ 数据集进行测试,经过五倍交叉验证后,平均准确率为 76.15,非加权 F1 值为 0.691。
Microexpression recognition algorithm based on multi feature fusion
Video facial micro expression recognition is difficult to extract features due to its short duration and small action amplitude. In order to better combine temporal and spatial information of video, the whole model is divided into local attention module, global attention module and temporal module. First, the local attention module intercepts the key areas and sends them to the network with channel attention after processing; Then the global attention module sends the data into the network with spatial attention after random erasure avoiding key areas; Finally, the temporal module sends the micro expression occurrence frame to the network with temporal shift module and spatial attention after processing; Finally, the classification results are obtained through three full connection layers after feature fusion. The experiment is tested based on CASMEⅡ dataset,After five-fold Cross Validation, the average accuracy rate is 76.15, the unweighted F1 value is 0.691.Compared with the mainstream algorithm, this method has improvement.