
Latest Publications: Proceedings of the ... ACM International Conference on Multimodal Interaction. ICMI (Conference)

Multimodal Automatic Coding of Client Behavior in Motivational Interviewing.
Leili Tavabi, Brian Borsari, Kalin Stefanov, Joshua D Woolley, Mohammad Soleymani, Larry Zhang, Stefan Scherer

Motivational Interviewing (MI) is defined as a collaborative conversation style that evokes the client's own intrinsic reasons for behavioral change. In MI research, the clients' attitude (willingness or resistance) toward change, as expressed through language, has been identified as an important indicator of their subsequent behavior change. Automated coding of these indicators provides systematic and efficient means for the analysis and assessment of MI therapy sessions. In this paper, we study and analyze behavioral cues in client language and speech that bear indications of the client's behavior toward change during a therapy session, using a database of dyadic motivational interviews between therapists and clients with alcohol-related problems. Deep language and voice encoders, i.e., BERT and VGGish, trained on large amounts of data are used to extract features from each utterance. We develop a neural network to automatically detect the MI codes using both the clients' and therapists' language and clients' voice, and demonstrate the importance of semantic context in such detection. Additionally, we develop machine learning models for predicting alcohol-use behavioral outcomes of clients through language and voice analysis. Our analysis demonstrates that we are able to estimate MI codes using clients' textual utterances along with preceding textual context from both the therapist and client, reaching an F1-score of 0.72 for a speaker-independent three-class classification. We also report initial results for using the clients' data for predicting behavioral outcomes, which outlines the direction for future work.
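As an illustration of the text branch of the pipeline described above, the sketch below encodes a client utterance together with its preceding turns using a pretrained BERT model and scores three MI classes with a small feed-forward head. The model choice (bert-base-uncased), layer sizes, the `encode_with_context` helper, and the example class names are illustrative assumptions rather than the authors' exact architecture, and the VGGish voice branch is omitted.

```python
# Illustrative sketch (not the authors' exact model): encode a client utterance
# together with preceding therapist/client turns using a pretrained BERT encoder,
# then score three MI classes with a small feed-forward head.
import torch
import torch.nn as nn
from transformers import BertTokenizer, BertModel

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
bert = BertModel.from_pretrained("bert-base-uncased")

class MICodeClassifier(nn.Module):
    def __init__(self, hidden=256, n_classes=3):
        super().__init__()
        self.head = nn.Sequential(
            nn.Linear(768, hidden), nn.ReLU(), nn.Dropout(0.2),
            nn.Linear(hidden, n_classes),
        )

    def forward(self, pooled_embedding):
        return self.head(pooled_embedding)

def encode_with_context(context_turns, client_utterance):
    """Concatenate preceding turns with the target utterance; [SEP] marks turn boundaries."""
    text = " [SEP] ".join(context_turns + [client_utterance])
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=256)
    with torch.no_grad():
        outputs = bert(**inputs)
    return outputs.last_hidden_state[:, 0]  # [CLS] embedding, shape (1, 768)

model = MICodeClassifier()
emb = encode_with_context(
    ["Therapist: How do you feel about cutting back?",
     "Client: I don't know, weekends are hard."],
    "Client: I guess I could try skipping Fridays.",
)
logits = model(emb)  # scores for e.g. change talk / sustain talk / follow-neutral (assumed labels)
```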

{"title":"Multimodal Automatic Coding of Client Behavior in Motivational Interviewing.","authors":"Leili Tavabi, Brian Borsari, Kalin Stefanov, Joshua D Woolley, Mohammad Soleymani, Larry Zhang, Stefan Scherer","doi":"10.1145/3382507.3418853","DOIUrl":"10.1145/3382507.3418853","url":null,"abstract":"<p><p>Motivational Interviewing (MI) is defined as a collaborative conversation style that evokes the client's own intrinsic reasons for behavioral change. In MI research, the clients' attitude (willingness or resistance) toward change as expressed through language, has been identified as an important indicator of their subsequent behavior change. Automated coding of these indicators provides systematic and efficient means for the analysis and assessment of MI therapy sessions. In this paper, we study and analyze behavioral cues in client language and speech that bear indications of the client's behavior toward change during a therapy session, using a database of dyadic motivational interviews between therapists and clients with alcohol-related problems. Deep language and voice encoders, <i>i.e.,</i> BERT and VGGish, trained on large amounts of data are used to extract features from each utterance. We develop a neural network to automatically detect the MI codes using both the clients' and therapists' language and clients' voice, and demonstrate the importance of semantic context in such detection. Additionally, we develop machine learning models for predicting alcohol-use behavioral outcomes of clients through language and voice analysis. Our analysis demonstrates that we are able to estimate MI codes using clients' textual utterances along with preceding textual context from both the therapist and client, reaching an F1-score of 0.72 for a speaker-independent three-class classification. We also report initial results for using the clients' data for predicting behavioral outcomes, which outlines the direction for future work.</p>","PeriodicalId":74508,"journal":{"name":"Proceedings of the ... ACM International Conference on Multimodal Interaction. ICMI (Conference)","volume":"2020 ","pages":"406-413"},"PeriodicalIF":0.0,"publicationDate":"2020-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8321780/pdf/nihms-1727152.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"39266881","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Automated Affect Detection in Deep Brain Stimulation for Obsessive-Compulsive Disorder: A Pilot Study.
Jeffrey F Cohn, Michael S Okun, Laszlo A Jeni, Itir Onal Ertugrul, David Borton, Donald Malone, Wayne K Goodman

Automated measurement of affective behavior in psychopathology has been limited primarily to screening and diagnosis. While useful, clinicians more often are concerned with whether patients are improving in response to treatment. Are symptoms abating, is affect becoming more positive, are unanticipated side effects emerging? When treatment includes neural implants, need for objective, repeatable biometrics tied to neurophysiology becomes especially pressing. We used automated face analysis to assess treatment response to deep brain stimulation (DBS) in two patients with intractable obsessive-compulsive disorder (OCD). One was assessed intraoperatively following implantation and activation of the DBS device. The other was assessed three months post-implantation. Both were assessed during DBS on and off conditions. Positive and negative valence were quantified using a CNN trained on normative data of 160 non-OCD participants. Thus, a secondary goal was domain transfer of the classifiers. In both contexts, DBS-on resulted in marked positive affect. In response to DBS-off, affect flattened in both contexts and alternated with increased negative affect in the outpatient setting. Mean AUC for domain transfer was 0.87. These findings suggest that parametric variation of DBS is strongly related to affective behavior and may introduce vulnerability for negative affect in the event that DBS is discontinued.
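A minimal sketch of the domain-transfer evaluation described here: a valence classifier fit only on normative (non-OCD) data is applied unchanged to frames from the DBS sessions and scored with AUC. A logistic regression on placeholder features stands in for the paper's CNN, and all arrays are synthetic.

```python
# Sketch of the domain-transfer evaluation: train on the normative (source) domain,
# apply without adaptation to the target (DBS) domain, score with AUC.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
# Placeholder arrays; in practice these would be per-frame face features and valence labels.
X_norm, y_norm = rng.normal(size=(5000, 64)), rng.integers(0, 2, 5000)  # normative domain
X_dbs, y_dbs = rng.normal(size=(800, 64)), rng.integers(0, 2, 800)      # target (DBS) domain

clf = LogisticRegression(max_iter=1000).fit(X_norm, y_norm)   # fit on source domain only
scores = clf.predict_proba(X_dbs)[:, 1]                       # apply unchanged to target domain
print("domain-transfer AUC:", roc_auc_score(y_dbs, scores))   # placeholder result on synthetic data
```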

{"title":"Automated Affect Detection in Deep Brain Stimulation for Obsessive-Compulsive Disorder: A Pilot Study.","authors":"Jeffrey F Cohn,&nbsp;Michael S Okun,&nbsp;Laszlo A Jeni,&nbsp;Itir Onal Ertugrul,&nbsp;David Borton,&nbsp;Donald Malone,&nbsp;Wayne K Goodman","doi":"10.1145/3242969.3243023","DOIUrl":"10.1145/3242969.3243023","url":null,"abstract":"<p><p>Automated measurement of affective behavior in psychopathology has been limited primarily to screening and diagnosis. While useful, clinicians more often are concerned with whether patients are improving in response to treatment. Are symptoms abating, is affect becoming more positive, are unanticipated side effects emerging? When treatment includes neural implants, need for objective, repeatable biometrics tied to neurophysiology becomes especially pressing. We used automated face analysis to assess treatment response to deep brain stimulation (DBS) in two patients with intractable obsessive-compulsive disorder (OCD). One was assessed intraoperatively following implantation and activation of the DBS device. The other was assessed three months post-implantation. Both were assessed during DBS on and o conditions. Positive and negative valence were quantified using a CNN trained on normative data of 160 non-OCD participants. Thus, a secondary goal was domain transfer of the classifiers. In both contexts, DBS-on resulted in marked positive affect. In response to DBS-off, affect flattened in both contexts and alternated with increased negative affect in the outpatient setting. Mean AUC for domain transfer was 0.87. These findings suggest that parametric variation of DBS is strongly related to affective behavior and may introduce vulnerability for negative affect in the event that DBS is discontinued.</p>","PeriodicalId":74508,"journal":{"name":"Proceedings of the ... ACM International Conference on Multimodal Interaction. ICMI (Conference)","volume":"2018 ","pages":"40-44"},"PeriodicalIF":0.0,"publicationDate":"2018-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1145/3242969.3243023","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"36748553","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 17
Viewpoint Integration for Hand-Based Recognition of Social Interactions from a First-Person View.
Sven Bambach, David J Crandall, Chen Yu

Wearable devices are becoming part of everyday life, from first-person cameras (GoPro, Google Glass), to smart watches (Apple Watch), to activity trackers (FitBit). These devices are often equipped with advanced sensors that gather data about the wearer and the environment. These sensors enable new ways of recognizing and analyzing the wearer's everyday personal activities, which could be used for intelligent human-computer interfaces and other applications. We explore one possible application by investigating how egocentric video data collected from head-mounted cameras can be used to recognize social activities between two interacting partners (e.g. playing chess or cards). In particular, we demonstrate that just the positions and poses of hands within the first-person view are highly informative for activity recognition, and present a computer vision approach that detects hands to automatically estimate activities. While hand pose detection is imperfect, we show that combining evidence across first-person views from the two social partners significantly improves activity recognition accuracy. This result highlights how integrating weak but complementary sources of evidence from social partners engaged in the same task can help to recognize the nature of their interaction.
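The viewpoint-integration idea can be illustrated in a few lines: each partner's first-person view yields per-activity probabilities from hand cues, and the two views are fused into a joint prediction. The activity names and probability values below are made up for illustration, and the product rule assumes conditionally independent views.

```python
# Hedged sketch of viewpoint integration: fuse per-activity probabilities from the
# two partners' head-mounted cameras by multiplying them (summing log-probabilities).
import numpy as np

activities = ["chess", "cards", "puzzle", "jenga"]   # illustrative activity set

p_view_a = np.array([0.40, 0.35, 0.15, 0.10])  # from partner A's first-person view
p_view_b = np.array([0.30, 0.55, 0.10, 0.05])  # from partner B's first-person view

joint = p_view_a * p_view_b                    # assumes conditionally independent views
joint /= joint.sum()                           # renormalize to a distribution

print("single view A:", activities[int(p_view_a.argmax())])
print("fused views  :", activities[int(joint.argmax())])
```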

{"title":"Viewpoint Integration for Hand-Based Recognition of Social Interactions from a First-Person View.","authors":"Sven Bambach,&nbsp;David J Crandall,&nbsp;Chen Yu","doi":"10.1145/2818346.2820771","DOIUrl":"https://doi.org/10.1145/2818346.2820771","url":null,"abstract":"<p><p>Wearable devices are becoming part of everyday life, from first-person cameras (GoPro, Google Glass), to smart watches (Apple Watch), to activity trackers (FitBit). These devices are often equipped with advanced sensors that gather data about the wearer and the environment. These sensors enable new ways of recognizing and analyzing the wearer's everyday personal activities, which could be used for intelligent human-computer interfaces and other applications. We explore one possible application by investigating how egocentric video data collected from head-mounted cameras can be used to recognize social activities between two interacting partners (e.g. playing chess or cards). In particular, we demonstrate that just the positions and poses of hands within the first-person view are highly informative for activity recognition, and present a computer vision approach that detects hands to automatically estimate activities. While hand pose detection is imperfect, we show that combining evidence across first-person views from the two social partners significantly improves activity recognition accuracy. This result highlights how integrating weak but complimentary sources of evidence from social partners engaged in the same task can help to recognize the nature of their interaction.</p>","PeriodicalId":74508,"journal":{"name":"Proceedings of the ... ACM International Conference on Multimodal Interaction. ICMI (Conference)","volume":"2015 ","pages":"351-354"},"PeriodicalIF":0.0,"publicationDate":"2015-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1145/2818346.2820771","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"35459048","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 46
Multimodal Detection of Depression in Clinical Interviews.
Hamdi Dibeklioğlu, Zakia Hammal, Ying Yang, Jeffrey F Cohn

Current methods for depression assessment depend almost entirely on clinical interview or self-report ratings. Such measures lack systematic and efficient ways of incorporating behavioral observations that are strong indicators of psychological disorder. We compared a clinical interview of depression severity with automatic measurement in 48 participants undergoing treatment for depression. Interviews were obtained at 7-week intervals on up to four occasions. Following standard cut-offs, participants at each session were classified as remitted, intermediate, or depressed. Logistic regression classifiers using leave-one-out validation were compared for facial movement dynamics, head movement dynamics, and vocal prosody individually and in combination. Accuracy (remitted versus depressed) for facial movement dynamics was higher than that for head movement dynamics; and each was substantially higher than that for vocal prosody. Accuracy for all three modalities together reached 88.93%, exceeding that for any single modality or pair of modalities. These findings suggest that automatic detection of depression from behavioral indicators is feasible and that multimodal measures afford most powerful detection.
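The evaluation protocol lends itself to a short sketch: leave-one-out logistic regression per modality and for the concatenated feature set. The feature arrays below are synthetic placeholders for the facial-dynamics, head-dynamics, and prosody measurements, so the printed accuracies are not the paper's results.

```python
# Sketch (not the authors' code) of leave-one-out logistic regression per modality
# and for all modalities combined.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import LeaveOneOut, cross_val_predict
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(1)
n = 48
y = rng.integers(0, 2, n)                       # remitted (0) vs. depressed (1), placeholder labels
face = rng.normal(size=(n, 20))                 # stand-in for facial movement dynamics
head = rng.normal(size=(n, 10))                 # stand-in for head movement dynamics
voice = rng.normal(size=(n, 15))                # stand-in for vocal prosody

def loo_accuracy(X, y):
    """Leave-one-out accuracy of a logistic regression classifier."""
    pred = cross_val_predict(LogisticRegression(max_iter=1000), X, y, cv=LeaveOneOut())
    return accuracy_score(y, pred)

for name, X in [("face", face), ("head", head), ("voice", voice),
                ("all three", np.hstack([face, head, voice]))]:
    print(f"{name}: {loo_accuracy(X, y):.2f}")
```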

{"title":"Multimodal Detection of Depression in Clinical Interviews.","authors":"Hamdi Dibeklioğlu,&nbsp;Zakia Hammal,&nbsp;Ying Yang,&nbsp;Jeffrey F Cohn","doi":"10.1145/2818346.2820776","DOIUrl":"https://doi.org/10.1145/2818346.2820776","url":null,"abstract":"<p><p>Current methods for depression assessment depend almost entirely on clinical interview or self-report ratings. Such measures lack systematic and efficient ways of incorporating behavioral observations that are strong indicators of psychological disorder. We compared a clinical interview of depression severity with automatic measurement in 48 participants undergoing treatment for depression. Interviews were obtained at 7-week intervals on up to four occasions. Following standard cut-offs, participants at each session were classified as remitted, intermediate, or depressed. Logistic regression classifiers using leave-one-out validation were compared for facial movement dynamics, head movement dynamics, and vocal prosody individually and in combination. Accuracy (remitted versus depressed) for facial movement dynamics was higher than that for head movement dynamics; and each was substantially higher than that for vocal prosody. Accuracy for all three modalities together reached 88.93%, exceeding that for any single modality or pair of modalities. These findings suggest that automatic detection of depression from behavioral indicators is feasible and that multimodal measures afford most powerful detection.</p>","PeriodicalId":74508,"journal":{"name":"Proceedings of the ... ACM International Conference on Multimodal Interaction. ICMI (Conference)","volume":"2015 ","pages":"307-310"},"PeriodicalIF":0.0,"publicationDate":"2015-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1145/2818346.2820776","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"34416673","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 72
Dyadic Behavior Analysis in Depression Severity Assessment Interviews.
Stefan Scherer, Zakia Hammal, Ying Yang, Louis-Philippe Morency, Jeffrey F Cohn

Previous literature suggests that depression impacts vocal timing of both participants and clinical interviewers but is mixed with respect to acoustic features. To investigate further, 57 middle-aged adults (men and women) with Major Depression Disorder and their clinical interviewers (all women) were studied. Participants were interviewed for depression severity on up to four occasions over a 21 week period using the Hamilton Rating Scale for Depression (HRSD), which is a criterion measure for depression severity in clinical trials. Acoustic features were extracted for both participants and interviewers using COVAREP Toolbox. Missing data occurred due to missed appointments, technical problems, or insufficient vocal samples. Data from 36 participants and their interviewers met criteria and were included for analysis to compare between high and low depression severity. Acoustic features for participants varied between men and women as expected, and failed to vary with depression severity for participants. For interviewers, acoustic characteristics strongly varied with severity of the interviewee's depression. Accommodation - the tendency of interactants to adapt their communicative behavior to each other - between interviewers and interviewees was inversely related to depression severity. These findings suggest that interviewers modify their acoustic features in response to depression severity, and depression severity strongly impacts interpersonal accommodation.
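One way to make the accommodation analysis concrete, under the assumption of a simple synchrony proxy that is not necessarily the measure used in the paper, is to relate the interviewer-interviewee distance on a session-level acoustic feature to HRSD severity; all values below are synthetic.

```python
# Illustrative accommodation proxy (an assumption, not the paper's measure):
# smaller interviewer-interviewee distance on a session-level acoustic feature is
# taken as more accommodation, and its association with HRSD severity is tested.
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(2)
n_sessions = 36
interviewee_f0 = rng.normal(200, 20, n_sessions)                       # e.g. mean f0 per session
interviewer_f0 = 0.3 * interviewee_f0 + rng.normal(130, 15, n_sessions)
hrsd = rng.integers(5, 30, n_sessions)                                 # depression severity scores

distance = np.abs(interviewer_f0 - interviewee_f0)   # per-session accommodation proxy
r, p = pearsonr(distance, hrsd)
print(f"feature distance vs. severity: r={r:.2f}, p={p:.3f}")
# A positive r would read as less accommodation at higher severity under this proxy.
```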

{"title":"Dyadic Behavior Analysis in Depression Severity Assessment Interviews.","authors":"Stefan Scherer, Zakia Hammal, Ying Yang, Louis-Philippe Morency, Jeffrey F Cohn","doi":"10.1145/2663204.2663238","DOIUrl":"10.1145/2663204.2663238","url":null,"abstract":"<p><p>Previous literature suggests that depression impacts vocal timing of both participants and clinical interviewers but is mixed with respect to acoustic features. To investigate further, 57 middle-aged adults (men and women) with Major Depression Disorder and their clinical interviewers (all women) were studied. Participants were interviewed for depression severity on up to four occasions over a 21 week period using the Hamilton Rating Scale for Depression (HRSD), which is a criterion measure for depression severity in clinical trials. Acoustic features were extracted for both participants and interviewers using COVAREP Toolbox. Missing data occurred due to missed appointments, technical problems, or insufficient vocal samples. Data from 36 participants and their interviewers met criteria and were included for analysis to compare between high and low depression severity. Acoustic features for participants varied between men and women as expected, and failed to vary with depression severity for participants. For interviewers, acoustic characteristics strongly varied with severity of the interviewee's depression. Accommodation - the tendency of interactants to adapt their communicative behavior to each other - between interviewers and interviewees was inversely related to depression severity. These findings suggest that interviewers modify their acoustic features in response to depression severity, and depression severity strongly impacts interpersonal accommodation.</p>","PeriodicalId":74508,"journal":{"name":"Proceedings of the ... ACM International Conference on Multimodal Interaction. ICMI (Conference)","volume":"2014 ","pages":"112-119"},"PeriodicalIF":0.0,"publicationDate":"2014-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1145/2663204.2663238","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"34857329","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 34
Automatic detection of pain intensity.
Zakia Hammal, Jeffrey F Cohn

Previous efforts suggest that occurrence of pain can be detected from the face. Can intensity of pain be detected as well? The Prkachin and Solomon Pain Intensity (PSPI) metric was used to classify four levels of pain intensity (none, trace, weak, and strong) in 25 participants with previous shoulder injury (McMaster-UNBC Pain Archive). Participants were recorded while they completed a series of movements of their affected and unaffected shoulders. From the video recordings, canonical normalized appearance of the face (CAPP) was extracted using active appearance modeling. To control for variation in face size, all CAPP were rescaled to 96×96 pixels. CAPP then was passed through a set of Log-Normal filters consisting of 7 frequencies and 15 orientations to extract 9216 features. To detect pain level, 4 support vector machines (SVMs) were separately trained for the automatic measurement of pain intensity on a frame-by-frame level using both 5-folds cross-validation and leave-one-subject-out cross-validation. F1 for each level of pain intensity ranged from 91% to 96% and from 40% to 67% for 5-folds and leave-one-subject-out cross-validation, respectively. Intra-class correlation, which assesses the consistency of continuous pain intensity between manual and automatic PSPI was 0.85 and 0.55 for 5-folds and leave-one-subject-out cross-validation, respectively, which suggests moderate to high consistency. These findings show that pain intensity can be reliably measured from facial expression in participants with orthopedic injury.
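The cross-validation setup maps onto a short scikit-learn sketch: a one-vs-rest SVM evaluated with both 5-fold and leave-one-subject-out splits, reporting per-class F1. The feature matrix, labels, and subject groups are synthetic stand-ins (128 random features instead of the 9216 Log-Normal filter responses), so the printed scores are placeholders.

```python
# Sketch of the two cross-validation protocols on placeholder data.
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import StratifiedKFold, LeaveOneGroupOut, cross_val_predict
from sklearn.metrics import f1_score

rng = np.random.default_rng(3)
n_frames, n_subjects = 2000, 25
X = rng.normal(size=(n_frames, 128))            # stand-in for the Log-Normal filter features
y = rng.integers(0, 4, n_frames)                # pain levels: none / trace / weak / strong
groups = rng.integers(0, n_subjects, n_frames)  # subject id per frame

svm = SVC(kernel="rbf", decision_function_shape="ovr")  # one-vs-rest multi-class SVM

pred_5fold = cross_val_predict(svm, X, y, cv=StratifiedKFold(5, shuffle=True, random_state=0))
pred_loso = cross_val_predict(svm, X, y, cv=LeaveOneGroupOut(), groups=groups)

print("5-fold F1 per level:", f1_score(y, pred_5fold, average=None).round(2))
print("LOSO   F1 per level:", f1_score(y, pred_loso, average=None).round(2))
```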

{"title":"Automatic detection of pain intensity.","authors":"Zakia Hammal, Jeffrey F Cohn","doi":"10.1145/2388676.2388688","DOIUrl":"10.1145/2388676.2388688","url":null,"abstract":"<p><p>Previous efforts suggest that occurrence of pain can be detected from the face. Can intensity of pain be detected as well? The Prkachin and Solomon Pain Intensity (PSPI) metric was used to classify four levels of pain intensity (none, trace, weak, and strong) in 25 participants with previous shoulder injury (McMaster-UNBC Pain Archive). Participants were recorded while they completed a series of movements of their affected and unaffected shoulders. From the video recordings, canonical normalized appearance of the face (CAPP) was extracted using active appearance modeling. To control for variation in face size, all CAPP were rescaled to 96×96 pixels. CAPP then was passed through a set of Log-Normal filters consisting of 7 frequencies and 15 orientations to extract 9216 features. To detect pain level, 4 support vector machines (SVMs) were separately trained for the automatic measurement of pain intensity on a frame-by-frame level using both 5-folds cross-validation and leave-one-subject-out cross-validation. F1 for each level of pain intensity ranged from 91% to 96% and from 40% to 67% for 5-folds and leave-one-subject-out cross-validation, respectively. Intra-class correlation, which assesses the consistency of continuous pain intensity between manual and automatic PSPI was 0.85 and 0.55 for 5-folds and leave-one-subject-out cross-validation, respectively, which suggests moderate to high consistency. These findings show that pain intensity can be reliably measured from facial expression in participants with orthopedic injury.</p>","PeriodicalId":74508,"journal":{"name":"Proceedings of the ... ACM International Conference on Multimodal Interaction. ICMI (Conference)","volume":"2012 ","pages":"47-52"},"PeriodicalIF":0.0,"publicationDate":"2012-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7385931/pdf/nihms-1599641.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"38205962","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Combining Video, Audio and Lexical Indicators of Affect in Spontaneous Conversation via Particle Filtering.
Arman Savran, Houwei Cao, Miraj Shah, Ani Nenkova, Ragini Verma
We present experiments on fusing facial video, audio and lexical indicators for affect estimation during dyadic conversations. We use temporal statistics of texture descriptors extracted from facial video, a combination of various acoustic features, and lexical features to create regression based affect estimators for each modality. The single modality regressors are then combined using particle filtering, by treating these independent regression outputs as measurements of the affect states in a Bayesian filtering framework, where previous observations provide prediction about the current state by means of learned affect dynamics. Tested on the Audio-visual Emotion Recognition Challenge dataset, our single modality estimators achieve substantially higher scores than the official baseline method for every dimension of affect. Our filtering-based multi-modality fusion achieves correlation performance of 0.344 (baseline: 0.136) and 0.280 (baseline: 0.096) for the fully continuous and word level sub challenges, respectively.
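A minimal particle filter conveys the fusion mechanism: a latent affect value evolves under simple random-walk dynamics, and each modality's regression output is treated as a noisy measurement of it. The dynamics and noise parameters below are illustrative assumptions rather than the learned affect dynamics from the paper, and the per-frame regressor outputs are simulated.

```python
# Minimal particle-filter sketch of fusing per-modality regression outputs as
# noisy measurements of a latent affect state (all parameters and data simulated).
import numpy as np

rng = np.random.default_rng(4)
n_particles, n_frames = 500, 100
proc_std = 0.05                                  # assumed random-walk process noise
meas_std = np.array([0.20, 0.25, 0.30])          # assumed per-modality noise (video, audio, lexical)

true_affect = np.cumsum(rng.normal(0, 0.05, n_frames))                # simulated affect trajectory
measurements = true_affect[:, None] + rng.normal(0, meas_std, (n_frames, 3))  # simulated regressor outputs

particles = rng.normal(0, 1, n_particles)
estimates = []
for z in measurements:
    particles = particles + rng.normal(0, proc_std, n_particles)      # predict step (affect dynamics)
    log_w = -0.5 * (((z[None, :] - particles[:, None]) / meas_std) ** 2).sum(axis=1)
    w = np.exp(log_w - log_w.max())
    w /= w.sum()                                                      # weight by all three modalities
    idx = rng.choice(n_particles, size=n_particles, p=w)              # resample
    particles = particles[idx]
    estimates.append(particles.mean())                                # fused per-frame affect estimate
```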
{"title":"Combining Video, Audio and Lexical Indicators of Affect in Spontaneous Conversation via Particle Filtering.","authors":"Arman Savran,&nbsp;Houwei Cao,&nbsp;Miraj Shah,&nbsp;Ani Nenkova,&nbsp;Ragini Verma","doi":"10.1145/2388676.2388781","DOIUrl":"https://doi.org/10.1145/2388676.2388781","url":null,"abstract":"We present experiments on fusing facial video, audio and lexical indicators for affect estimation during dyadic conversations. We use temporal statistics of texture descriptors extracted from facial video, a combination of various acoustic features, and lexical features to create regression based affect estimators for each modality. The single modality regressors are then combined using particle filtering, by treating these independent regression outputs as measurements of the affect states in a Bayesian filtering framework, where previous observations provide prediction about the current state by means of learned affect dynamics. Tested on the Audio-visual Emotion Recognition Challenge dataset, our single modality estimators achieve substantially higher scores than the official baseline method for every dimension of affect. Our filtering-based multi-modality fusion achieves correlation performance of 0.344 (baseline: 0.136) and 0.280 (baseline: 0.096) for the fully continuous and word level sub challenges, respectively.","PeriodicalId":74508,"journal":{"name":"Proceedings of the ... ACM International Conference on Multimodal Interaction. ICMI (Conference)","volume":"2012 ","pages":"485-492"},"PeriodicalIF":0.0,"publicationDate":"2012-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1145/2388676.2388781","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"32734741","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 73