Learning to Generate Context-Sensitive Backchannel Smiles for Embodied AI Agents with Applications in Mental Health Dialogues.

CEUR workshop proceedings Pub Date : 2024-02-01
Maneesh Bilalpur, Mert Inan, Dorsa Zeinali, Jeffrey F Cohn, Malihe Alikhani
CEUR Workshop Proceedings, vol. 3649, pp. 12-22. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11608428/pdf/

Abstract

Addressing the critical shortage of mental health resources for effective screening, diagnosis, and treatment remains a significant challenge. This scarcity underscores the need for innovative solutions, particularly in enhancing the accessibility and efficacy of therapeutic support. Embodied agents with advanced interactive capabilities emerge as a promising and cost-effective supplement to traditional caregiving methods. Crucial to these agents' effectiveness is their ability to simulate non-verbal behaviors, such as backchannels, which are pivotal in establishing rapport and understanding in therapeutic contexts but remain under-explored. To improve the rapport-building capabilities of embodied agents, we annotated backchannel smiles in videos of intimate face-to-face conversations on topics such as mental health, illness, and relationships. We hypothesized that both speaker and listener behaviors affect the duration and intensity of backchannel smiles. Using cues from speech prosody and language, along with the demographics of the speaker and listener, we found these features to contain significant predictors of backchannel smile intensity. Based on our findings, we frame backchannel smile production in embodied agents as a generation problem. Our attention-based generative model suggests that incorporating listener information improves performance over a baseline speaker-centric generation approach. Conditioning generation on the significant predictors of smile intensity yields statistically significant improvements in empirical measures of generation quality. A user study in which generated smiles were transferred to an embodied agent suggests that an agent with backchannel smiles is perceived as more human-like and, for non-personal conversations, is an attractive alternative to an agent without them.
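The listener-conditioned attention idea in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' model: the feature dimensions, the use of a listener embedding as the attention query, random weights, and the sigmoid intensity head are all illustrative assumptions. It shows only the general shape of the approach, i.e. attending over speaker prosody frames with listener context and conditioning the smile-intensity prediction on both.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attend(query, keys, values):
    # scaled dot-product attention: query attends over speaker frames
    scores = query @ keys.T / np.sqrt(keys.shape[-1])
    weights = softmax(scores)
    return weights @ values, weights

rng = np.random.default_rng(0)
T, d = 5, 8                                # 5 speaker frames, 8-dim features (assumed sizes)
speaker_feats = rng.normal(size=(T, d))    # hypothetical prosody/language embeddings per frame
listener_ctx = rng.normal(size=(d,))       # hypothetical listener embedding (e.g. demographics)

# listener context serves as the query over the speaker's frames
ctx, w = attend(listener_ctx, speaker_feats, speaker_feats)

# condition a toy intensity head on attended speaker info plus listener context
W_out = rng.normal(size=(2 * d,))          # random weights, illustration only
intensity = 1.0 / (1.0 + np.exp(-(np.concatenate([ctx, listener_ctx]) @ W_out)))
print(round(float(intensity), 3), w.shape)
```

A speaker-centric baseline would drop `listener_ctx` and attend with a learned query over speaker frames alone; the abstract's claim is that adding the listener side of this computation improves generation quality.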
