Zichen Liang, Haiying Xia, Yumei Tan, Shuxiang Song
Hard semantic mask strategy for automatic facial action unit recognition with teacher–student model
DOI: 10.1007/s00530-024-01385-x
Published: 2024-06-20
Abstract
The Facial Action Coding System (FACS) is widely used in affective computing; it defines a set of facial action units (AUs), each corresponding to a localized region of the face. Fine-grained features from these critical regions are crucial for accurate AU recognition. However, the conventional random masking used in Masked Image Modeling (MIM) often overlooks the inherent symmetry of faces and the complex interrelationships among facial muscles, so critical local details are lost and AU recognition suffers. To address these limitations, we propose a novel teacher–student MIM framework called Hard Semantic Masking Strategy Teacher–Student (HSMS-TS). Specifically, we first introduce a hard semantic mask strategy in the teacher model, which aims to guide the student network to focus on learning fine-grained AU-related representations. Then, the student network uses the attention maps from the pretrained teacher model to generate a more challenging mask from a predefined template, increasing the learning difficulty and helping the student acquire better AU-related representations. Experimental results on two publicly available datasets, BP4D and DISFA, demonstrate the effectiveness of the proposed method. Code will be publicly available at http://github.com/lzichen/HSMS-TS.
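The abstract does not say how the teacher's attention maps and the predefined template are combined, so the following is only an illustrative sketch of attention-guided, symmetry-aware masking, not the authors' implementation. It assumes the attention map is given as a grid of per-patch scores and that "harder" means hiding the most-attended patches together with their left-right mirror images. The function name, the mirror-pair scoring rule, and the `mask_ratio` parameter are all hypothetical, and no template is modeled here.

```python
def attention_guided_symmetric_mask(attn, mask_ratio=0.5):
    """Return an H x W grid of booleans; True marks a masked patch.

    attn: H x W nested list of attention scores, one per image patch
          (assumed input format, not taken from the paper).
    mask_ratio: fraction of patches to hide (hypothetical parameter).
    """
    h, w = len(attn), len(attn[0])
    if w % 2 != 0:
        raise ValueError("this sketch assumes an even number of patch columns")
    half = w // 2

    # Score each left-half patch jointly with its mirror image, so that
    # both sides of the face are always masked together.
    scored = []
    for r in range(h):
        for c in range(half):
            pair_score = 0.5 * (attn[r][c] + attn[r][w - 1 - c])
            scored.append((pair_score, r, c))

    # Hardest pairs first: highest combined attention.
    scored.sort(key=lambda t: t[0], reverse=True)

    # Number of mirror pairs to hide; each pair covers two patches.
    k = round(mask_ratio * h * half)

    mask = [[False] * w for _ in range(h)]
    for _, r, c in scored[:k]:
        mask[r][c] = True
        mask[r][w - 1 - c] = True
    return mask


# Toy 2 x 4 attention grid: the centre of the top row and the outer
# columns of the bottom row receive the most attention.
attn = [
    [0.1, 0.9, 0.8, 0.2],
    [0.3, 0.1, 0.1, 0.4],
]
mask = attention_guided_symmetric_mask(attn, mask_ratio=0.5)
for row in mask:
    print("".join("#" if m else "." for m in row))
```

Masking in mirror pairs is one simple way to keep a model from reconstructing a hidden region by copying its visible counterpart on the other side of the face, which is the weakness of random masking that the abstract points to.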