Empathy Level Alignment via Reinforcement Learning for Empathetic Response Generation

IF 9.8; Zone 2, Computer Science; Q1, COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE; IEEE Transactions on Affective Computing; Pub Date: 2025-02-21; DOI: 10.1109/TAFFC.2025.3544594
Hui Ma;Bo Zhang;Bo Xu;Jian Wang;Hongfei Lin;Xiao Sun
IEEE Transactions on Affective Computing, vol. 16, no. 3, pp. 1873-1884. Publication type: Journal Article. URL: https://ieeexplore.ieee.org/document/10899840/
Citations: 0

Abstract

Empathetic response generation, aiming to understand the user’s situation and feelings and respond empathically, is crucial in building human-like dialogue systems. Traditional approaches typically employ maximum likelihood estimation as the optimization objective during training, yet fail to align the empathy levels between generated and target responses. To this end, we propose an empathetic response generation framework using reinforcement learning (EmpRL). The framework develops an effective empathy reward function and generates empathetic responses by maximizing the expected reward through reinforcement learning. EmpRL utilizes the pre-trained T5 model as the generator and further fine-tunes it to initialize the policy. To align the empathy levels between generated and target responses within a given context, an empathy reward function containing three empathy communication mechanisms—emotional reaction, interpretation, and exploration—is constructed using pre-designed and pre-trained empathy identifiers. During reinforcement learning training, the proximal policy optimization algorithm is used to fine-tune the policy, enabling the generation of empathetic responses. Both automatic and human evaluations demonstrate that the proposed EmpRL framework significantly improves the quality of generated responses, enhances the similarity in empathy levels between generated and target responses, and produces empathetic responses covering both affective and cognitive aspects.
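The abstract does not give the reward formula, so the following is only a minimal sketch of how an empathy-alignment reward over the three named mechanisms could look. It assumes each mechanism is scored as an integer level from 0 to 2 by some external empathy identifier, and that the reward is one minus the normalized mean absolute level difference; the level range, the function name `empathy_reward`, and the formula itself are illustrative assumptions, not the authors' definition.

```python
# Hypothetical sketch: reward is high when the generated response's empathy
# levels match the target's on each communication mechanism.
# ASSUMPTIONS (not from the paper): integer levels in [0, MAX_LEVEL] and a
# reward of 1 - normalized mean absolute difference.

MECHANISMS = ("emotional_reaction", "interpretation", "exploration")
MAX_LEVEL = 2  # assumed upper bound of an empathy level


def empathy_reward(generated_levels, target_levels, max_level=MAX_LEVEL):
    """Return a reward in [0, 1]; 1.0 means identical empathy levels."""
    total_diff = 0
    for mechanism in MECHANISMS:
        gen = generated_levels[mechanism]
        tgt = target_levels[mechanism]
        if not (0 <= gen <= max_level and 0 <= tgt <= max_level):
            raise ValueError(f"level out of range for {mechanism}")
        total_diff += abs(gen - tgt)
    # Normalize by the largest possible total difference.
    return 1.0 - total_diff / (len(MECHANISMS) * max_level)


if __name__ == "__main__":
    # Levels would come from pre-trained empathy identifiers; hard-coded here.
    generated = {"emotional_reaction": 2, "interpretation": 1, "exploration": 0}
    target = {"emotional_reaction": 2, "interpretation": 1, "exploration": 2}
    print(round(empathy_reward(generated, target), 4))
```

In a reinforcement learning loop, a scalar of this kind would be the per-response reward passed to the policy optimizer (proximal policy optimization in the paper); how the paper combines or scales the three mechanism scores is not stated in the abstract.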
Source journal
IEEE Transactions on Affective Computing (COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE; COMPUTER SCIENCE, CYBERNETICS)
CiteScore: 15.00
Self-citation rate: 6.20%
Articles published: 174
About the journal: The IEEE Transactions on Affective Computing is an international and interdisciplinary journal. Its primary goal is to share research findings on the development of systems capable of recognizing, interpreting, and simulating human emotions and related affective phenomena. The journal publishes original research on the underlying principles and theories that explain how and why affective factors shape human-technology interactions. It also focuses on how techniques for sensing and simulating affect can enhance our understanding of human emotions and processes. Additionally, the journal explores the design, implementation, and evaluation of systems that prioritize the consideration of affect in their usability. We also welcome surveys of existing work that provide new perspectives on the historical and future directions of this field.