Multimodal Automatic Coding of Client Behavior in Motivational Interviewing
Leili Tavabi, Brian Borsari, Kalin Stefanov, Joshua D Woolley, Mohammad Soleymani, Larry Zhang, Stefan Scherer
Proceedings of the ... ACM International Conference on Multimodal Interaction. ICMI (Conference), 2020, pp. 406-413. Published 2020-10-01.
DOI: 10.1145/3382507.3418853 (https://doi.org/10.1145/3382507.3418853)
Full text: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8321780/pdf/nihms-1727152.pdf
Abstract
Motivational Interviewing (MI) is defined as a collaborative conversation style that evokes the client's own intrinsic reasons for behavioral change. In MI research, clients' attitude toward change (willingness or resistance), as expressed through language, has been identified as an important indicator of their subsequent behavior change. Automated coding of these indicators provides a systematic and efficient means of analyzing and assessing MI therapy sessions. In this paper, we study behavioral cues in client language and speech that indicate the client's behavior toward change during a therapy session, using a database of dyadic motivational interviews between therapists and clients with alcohol-related problems. Deep language and voice encoders trained on large amounts of data, namely BERT and VGGish, are used to extract features from each utterance. We develop a neural network that automatically detects the MI codes using both the clients' and therapists' language and the clients' voice, and we demonstrate the importance of semantic context in such detection. Additionally, we develop machine learning models for predicting clients' alcohol-use behavioral outcomes through language and voice analysis. Our analysis demonstrates that MI codes can be estimated from clients' textual utterances together with the preceding textual context from both therapist and client, reaching an F1-score of 0.72 for speaker-independent three-class classification. We also report initial results on using the clients' data to predict behavioral outcomes, which outlines the direction for future work.
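The F1-score reported above summarizes performance on a three-class problem. As a rough illustration only, the sketch below shows one common way such a score can be computed, macro-averaging per-class F1. The abstract does not state which averaging scheme was used, so macro-averaging is an assumption here. The label names (`change`, `sustain`, `neutral`) and the toy predictions are hypothetical and are not taken from the paper's data.

```python
# Hypothetical three-class label set; the abstract does not name the classes.
LABELS = ("change", "sustain", "neutral")


def per_class_f1(y_true, y_pred, label):
    """F1 for one class, treating `label` as positive and all others as negative."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    denom = 2 * tp + fp + fn
    # Convention: F1 is 0.0 when the class never appears in truth or prediction.
    return 2 * tp / denom if denom else 0.0


def macro_f1(y_true, y_pred, labels=LABELS):
    """Unweighted mean of per-class F1 (assumed averaging scheme)."""
    return sum(per_class_f1(y_true, y_pred, lab) for lab in labels) / len(labels)


if __name__ == "__main__":
    # Toy utterance-level labels, invented purely for illustration.
    y_true = ["change", "change", "sustain", "neutral", "neutral", "neutral"]
    y_pred = ["change", "sustain", "sustain", "neutral", "neutral", "change"]
    for lab in LABELS:
        print(lab, round(per_class_f1(y_true, y_pred, lab), 3))
    print("macro F1:", round(macro_f1(y_true, y_pred), 3))
```

For a speaker-independent evaluation as described in the abstract, the utterances scored this way would come from clients who do not appear in the training data; how the folds were built is not specified in this section.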