In the Pursuit of Effective Affective Computing: The Relationship Between Features and Registration.

IEEE Transactions on Systems Man and Cybernetics Part B-Cybernetics. Pub Date: 2012-08-01. Epub Date: 2012-05-07. DOI: 10.1109/TSMCB.2012.2194485

S W Chew, P Lucey, S Lucey, J Saragih, J F Cohn, I Matthews, S Sridharan


Citations: 67

Abstract

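For facial expression recognition systems to be applicable in the real world, they need to detect and track a previously unseen person's face and its facial movements accurately in realistic environments. A highly plausible solution is to perform a "dense" form of alignment, in which 60-70 fiducial facial points are tracked with high accuracy. In practice, however, this type of dense alignment had so far been impossible to achieve in a generic sense, mainly because of poor reliability and robustness. Instead, many expression detection methods have opted for a "coarse" form of face alignment, followed by a biologically inspired appearance descriptor such as the histogram of oriented gradients (HOG) or Gabor magnitudes. Encouragingly, recent advances in a number of dense alignment algorithms, for example constrained local models (CLMs), have demonstrated both high reliability and high accuracy for unseen subjects. This raises the question: aside from countering illumination variation, what do these appearance descriptors do that standard pixel representations do not? In this paper, we show that, when close to perfect alignment is obtained, there is no real benefit in employing these different appearance-based representations (under consistent illumination conditions). When misalignment does occur, we show that these appearance descriptors work well because they encode robustness to alignment error. For this work, we compared two popular methods for dense alignment, subject-dependent active appearance models (AAMs) and subject-independent CLMs, on the task of action-unit detection. These comparisons were conducted through a battery of experiments across several publicly available data sets (CK+, Pain, M3, and GEMEP-FERA). We also report our performance on the subject-independent task of the 2011 Facial Expression Recognition and Analysis Challenge.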
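The claim that appearance descriptors encode robustness to alignment error can be illustrated with a small synthetic experiment. The sketch below is not the authors' pipeline: it uses a random-noise image instead of face data, a one-pixel shift as a stand-in for registration error, and a deliberately simplified HOG-style descriptor (per-cell orientation histograms, no block normalization). The names `pixel_features`, `hog_like_features`, and `relative_change` are hypothetical helpers introduced only for this illustration.

```python
import numpy as np


def pixel_features(patch):
    """Raw pixel representation: the patch flattened into a vector."""
    return patch.ravel().astype(float)


def hog_like_features(patch, cell=8, bins=8):
    """Simplified HOG-style descriptor (assumption: no block normalization).

    Each cell contributes a histogram of unsigned gradient orientations,
    weighted by gradient magnitude.
    """
    gy, gx = np.gradient(patch.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.mod(np.arctan2(gy, gx), np.pi)  # unsigned orientation in [0, pi)
    bin_idx = np.minimum((ang / np.pi * bins).astype(int), bins - 1)
    h, w = patch.shape
    feats = []
    for r in range(0, h - cell + 1, cell):
        for c in range(0, w - cell + 1, cell):
            m = mag[r:r + cell, c:c + cell].ravel()
            b = bin_idx[r:r + cell, c:c + cell].ravel()
            feats.append(np.bincount(b, weights=m, minlength=bins))
    return np.concatenate(feats)


def relative_change(f_ref, f_shifted):
    """Norm of the feature difference, relative to the reference norm."""
    return np.linalg.norm(f_ref - f_shifted) / np.linalg.norm(f_ref)


rng = np.random.default_rng(0)
image = rng.random((80, 80))      # synthetic texture, not a face image
aligned = image[8:72, 8:72]       # reference 64x64 crop
misaligned = image[8:72, 9:73]    # same crop with a 1-pixel horizontal error

pixel_change = relative_change(pixel_features(aligned),
                               pixel_features(misaligned))
hog_change = relative_change(hog_like_features(aligned),
                             hog_like_features(misaligned))

print(f"relative change, raw pixels: {pixel_change:.3f}")
print(f"relative change, HOG-like:   {hog_change:.3f}")
```

Because each cell pools gradients over an 8x8 region, a one-pixel shift replaces only a small fraction of the pixels that feed each histogram, so the pooled descriptor moves less than the raw pixel vector. The sketch only measures feature stability; it does not test the paper's other finding, that with near-perfect alignment the descriptors bring no real gain in detection performance.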




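Source journal

IEEE Transactions on Systems Man and Cybernetics Part B-Cybernetics (Engineering & Technology - Computer Science: Cybernetics)

Self-citation rate

0.00%

Number of papers published

Review time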

6.0 months