Analysis of Face-Touching Behavior in Large Scale Social Interaction Dataset

Proceedings of the 2020 International Conference on Multimodal Interaction. Pub Date: 2020-10-21. DOI: 10.1145/3382507.3418876

Cigdem Beyan, Matteo Bustreo, Muhammad Shahid, Gianluca Bailo, N. Carissimi, Alessio Del Bue


Citations: 14

Abstract

We present the first publicly available annotations for the analysis of face-touching behavior. The annotations cover a dataset of audio-visual recordings of small-group social interactions: 64 videos in total, each lasting between 12 and 30 minutes and showing a single person taking part in a four-person meeting. A total of 16 annotators produced the labels, with almost perfect agreement on average (Cohen's Kappa = 0.89). In total, 74K video frames were labelled as face-touch and 2M as no-face-touch. Given the dataset and the collected annotations, we also present an extensive evaluation of several methods for face-touching detection: rule-based detection, supervised learning with hand-crafted features, and feature learning and inference with a Convolutional Neural Network (CNN). Our evaluation indicates that the CNN performed best among these, reaching an F1-score of 83.76% and a Matthews Correlation Coefficient of 0.84. To foster future research on this problem, the code and dataset were made publicly available (github.com/IIT-PAVIS/Face-Touching-Behavior), providing all video frames, face-touch annotations, body pose estimations including face and hand key-point detections, and face bounding boxes, as well as the implemented baseline methods and the cross-validation splits used for training and evaluating our models.
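The three statistics quoted in the abstract (Cohen's Kappa, F1-score, and Matthews Correlation Coefficient) can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function names and the toy labels below are hypothetical and chosen only to show how each quantity is computed for binary frame labels (1 = face-touch, 0 = no-face-touch).

```python
from math import sqrt


def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels, with 1 as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def f1_score(tp, fp, fn):
    """F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when there are no positives at all."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def mcc(tp, tn, fp, fn):
    """Matthews Correlation Coefficient; defined as 0.0 when any marginal is empty."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def cohen_kappa(labels_a, labels_b):
    """Cohen's Kappa between two annotators' binary labels for the same frames."""
    n = len(labels_a)
    observed = sum(1 for a, b in zip(labels_a, labels_b) if a == b) / n
    pos_a = sum(labels_a) / n
    pos_b = sum(labels_b) / n
    # Agreement expected by chance, from each annotator's label frequencies.
    expected = pos_a * pos_b + (1 - pos_a) * (1 - pos_b)
    return (observed - expected) / (1 - expected) if expected != 1 else 1.0


# Hypothetical toy labels (not taken from the dataset): 2 positives in 10 frames.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]
always_negative = [0] * len(y_true)

tp, tn, fp, fn = confusion_counts(y_true, y_pred)
print("F1:", f1_score(tp, fp, fn))
print("MCC:", mcc(tp, tn, fp, fn))
print("Kappa:", cohen_kappa(y_true, y_pred))

# A predictor that never reports a touch is right on 80% of these frames,
# yet scores zero on both F1 and MCC.
tp0, tn0, fp0, fn0 = confusion_counts(y_true, always_negative)
print("Accuracy (always negative):", (tp0 + tn0) / len(y_true))
print("F1 (always negative):", f1_score(tp0, fp0, fn0))
print("MCC (always negative):", mcc(tp0, tn0, fp0, fn0))
```

With roughly 74K positive frames against 2M negative ones, plain accuracy would reward a detector that never predicts a touch, which is one plausible reason for reporting F1 and MCC instead.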
