{"title":"Facial Action Unit Recognition Enhanced by Text Descriptions of FACS","authors":"Yanan Chang;Caichao Zhang;Yi Wu;Shangfei Wang","doi":"10.1109/TAFFC.2024.3470524","DOIUrl":null,"url":null,"abstract":"Although the descriptions of facial action units (AUs) provide crucial semantic knowledge for representation learning from facial images, they have not been fully explored for facial action unit recognition. In this paper, we propose a method that effectively explores the knowledge existing in AU descriptions to enhance AU recognition. Specifically, the proposed method consists of three components, i.e., AU recognition network, global representation alignment, and AU representation alignment. The AU recognition network extracts global features and AU-specific features for AU prediction from images. To leverage AU textual descriptions fully, we design two-level representation alignment for AU recognition. The global representation alignment component closes the distance between the global facial features and its corresponding positive global embedding extracted from textual descriptions. Then, the AU-specific features are aligned with the positive AU textual embedding by the AU representation alignment component. Negative textual embedding generation strategies are also designed to further boost the two-level representation alignment. Through the two-level alignment, AU textual descriptions guide image representation learning of the AU recognition network. 
Experiments on two benchmark datasets and one in-the-wild dataset demonstrate the efficacy of the description-enhanced AU recognition method, compared with the state-of-the-art works.","PeriodicalId":13131,"journal":{"name":"IEEE Transactions on Affective Computing","volume":"16 2","pages":"814-826"},"PeriodicalIF":9.8000,"publicationDate":"2024-09-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Affective Computing","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/10699431/","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Citations: 0
Abstract
Although the descriptions of facial action units (AUs) provide crucial semantic knowledge for representation learning from facial images, they have not been fully explored for facial action unit recognition. In this paper, we propose a method that effectively exploits the knowledge in AU descriptions to enhance AU recognition. Specifically, the proposed method consists of three components: an AU recognition network, global representation alignment, and AU representation alignment. The AU recognition network extracts global features and AU-specific features from images for AU prediction. To fully leverage AU textual descriptions, we design a two-level representation alignment for AU recognition. The global representation alignment component closes the distance between the global facial features and their corresponding positive global embedding extracted from the textual descriptions. The AU-specific features are then aligned with the positive AU textual embeddings by the AU representation alignment component. Negative textual embedding generation strategies are also designed to further strengthen the two-level representation alignment. Through this two-level alignment, the AU textual descriptions guide the image representation learning of the AU recognition network. Experiments on two benchmark datasets and one in-the-wild dataset demonstrate the efficacy of the description-enhanced AU recognition method compared with state-of-the-art methods.
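The two-level alignment described in the abstract can be sketched as a contrastive objective that pulls an image feature toward its positive textual embedding while pushing it away from negative textual embeddings, applied once at the global level and once per AU. The sketch below is an illustrative reconstruction, not the paper's implementation; the InfoNCE-style form, the function names, and the temperature value are assumptions.

```python
import numpy as np

def cosine(a, b):
    # Cosine similarity between two feature vectors.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def alignment_loss(img_feat, pos_text, neg_texts, temperature=0.1):
    """InfoNCE-style alignment: pull img_feat toward the positive textual
    embedding and away from the negative textual embeddings.
    Hypothetical sketch of the paper's alignment components."""
    sims = [cosine(img_feat, pos_text)] + [cosine(img_feat, t) for t in neg_texts]
    logits = np.array(sims) / temperature
    logits -= logits.max()  # numerical stability before softmax
    probs = np.exp(logits) / np.exp(logits).sum()
    return -np.log(probs[0])  # positive embedding sits at index 0
```

Under this reading, the global alignment component would apply one such loss to the whole-face feature, and the AU alignment component would sum one loss per AU-specific feature against its own positive and generated negative textual embeddings.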
Journal Introduction:
The IEEE Transactions on Affective Computing is an international and interdisciplinary journal. Its primary goal is to share research findings on the development of systems capable of recognizing, interpreting, and simulating human emotions and related affective phenomena. The journal publishes original research on the underlying principles and theories that explain how and why affective factors shape human-technology interactions. It also focuses on how techniques for sensing and simulating affect can enhance our understanding of human emotions and processes. Additionally, the journal explores the design, implementation, and evaluation of systems that prioritize the consideration of affect in their usability. We also welcome surveys of existing work that provide new perspectives on the historical and future directions of this field.