Katie Aafjes-van Doorn, Marcelo Cicconet, Jeffrey F Cohn, Marc Aafjes
Psychotherapy Research, pp. 256-270. Journal Article. Published 2025-02-01 (Epub 2025-01-01). DOI: 10.1080/10503307.2024.2428702, https://doi.org/10.1080/10503307.2024.2428702
Citations: 0
Predicting working alliance in psychotherapy: A multi-modal machine learning approach.
Objective: Session-by-session tracking of the working alliance enables clinicians to detect alliance deterioration and intervene accordingly, which has been shown to improve treatment outcome and reduce dropout. Despite this, regular use of alliance self-report measures has failed to gain widespread implementation. We aimed to develop automated alliance prediction using behavioral features obtained from video-recorded therapy sessions.
Method: A naturalistic dataset of session recordings with patient ratings of working alliance was available for 252 in-person and teletherapy sessions from 47 patients treated by 10 clinicians. Text- and audio-based features were extracted from all 252 sessions. Additional video-based feature extraction was possible for a subsample of 80 sessions. We developed a modeling pipeline, for audio and text and for audio, text, and video, to train machine learning regression models that fuse multimodal features.
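The two feature sets described in the Method can be pictured with a minimal sketch. It assumes early fusion by simple concatenation, which the abstract does not specify; the `Session` class, its field names, and the helper functions are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Session:
    """Hypothetical container for one therapy session's features."""
    session_id: str
    audio: List[float]
    text: List[float]
    video: Optional[List[float]] = None  # None when no video features exist
    alliance: float = 0.0                # patient-rated working alliance


def fuse(session: Session, use_video: bool) -> List[float]:
    """Concatenate per-modality features (assumed fusion strategy)."""
    features = session.audio + session.text
    if use_video:
        if session.video is None:
            raise ValueError(f"{session.session_id} has no video features")
        features = features + session.video
    return features


def build_datasets(
    sessions: List[Session],
) -> Tuple[List[Tuple[List[float], float]], List[Tuple[List[float], float]]]:
    """Return (audio+text, audio+text+video) datasets of (features, target)."""
    # Audio and text are available for every session.
    audio_text = [(fuse(s, False), s.alliance) for s in sessions]
    # Video is available only for a subsample, so filter before fusing.
    audio_text_video = [
        (fuse(s, True), s.alliance) for s in sessions if s.video is not None
    ]
    return audio_text, audio_text_video
```

Applied to the study's data, the first list would cover all 252 sessions and the second only the 80 sessions with video; either could then be passed to a regression model.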
Results: Best results were achieved with a Gradient Boosting architecture, when using audio, text, and video features extracted from the patient (ICC = 0.66, Pearson r = 0.70, MAE = 0.33).
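For readers unfamiliar with the three reported metrics, the following sketch shows how each can be computed from paired observed and predicted alliance scores. The abstract does not state which ICC form was used; the two-way random-effects, absolute-agreement, single-measure form ICC(2,1) below is an assumption, and the function names are illustrative.

```python
import math
from typing import Sequence


def pearson_r(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Pearson correlation between observed and predicted scores."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)


def mae(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute error, in the units of the alliance scale."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def icc_2_1(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """ICC(2,1), treating observed and predicted scores as two raters.

    Assumption: the paper may have used a different ICC form.
    """
    n, k = len(y_true), 2
    rows = list(zip(y_true, y_pred))
    grand = (sum(y_true) + sum(y_pred)) / (n * k)
    row_means = [(t + p) / k for t, p in rows]
    col_means = [sum(y_true) / n, sum(y_pred) / n]
    # Mean squares from a two-way ANOVA without replication.
    msr = k * sum((m - grand) ** 2 for m in row_means) / (n - 1)
    msc = n * sum((m - grand) ** 2 for m in col_means) / (k - 1)
    sse = sum(
        (x - row_means[i] - col_means[j] + grand) ** 2
        for i, row in enumerate(rows)
        for j, x in enumerate(row)
    )
    mse = sse / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)
```

Unlike Pearson r, this ICC form penalizes systematic offsets: predictions that are perfectly correlated with the observed scores but shifted by a constant still yield r = 1 while the ICC drops below 1.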
Conclusion: Automated alliance prediction from video-recorded therapy sessions is feasible with high accuracy. A data-driven multimodal approach to feature extraction and selection enables powerful models, outperforming previous work.
About the journal:
Psychotherapy Research seeks to enhance the development, scientific quality, and social relevance of psychotherapy research and to foster the use of research findings in practice, education, and policy formulation. The Journal publishes reports of original research on all aspects of psychotherapy, including its outcomes, its processes, education of practitioners, and delivery of services. It also publishes methodological, theoretical, and review articles of direct relevance to psychotherapy research. The Journal is addressed to an international, interdisciplinary audience and welcomes submissions dealing with diverse theoretical orientations, treatment modalities.