Ash Tanuj Kumar, Cindy Wang, Alec Dong, Jonathan Rose
{"title":"使用 GPT-4 为基于动机访谈的戒烟聊天机器人生成后向复杂反映:算法开发与验证。","authors":"Ash Tanuj Kumar, Cindy Wang, Alec Dong, Jonathan Rose","doi":"10.2196/53778","DOIUrl":null,"url":null,"abstract":"<p><strong>Background: </strong>Motivational interviewing (MI) is a therapeutic technique that has been successful in helping smokers reduce smoking but has limited accessibility due to the high cost and low availability of clinicians. To address this, the MIBot project has sought to develop a chatbot that emulates an MI session with a client with the specific goal of moving an ambivalent smoker toward the direction of quitting. One key element of an MI conversation is reflective listening, where a therapist expresses their understanding of what the client has said by uttering a reflection that encourages the client to continue their thought process. Complex reflections link the client's responses to relevant ideas and facts to enhance this contemplation. Backward-looking complex reflections (BLCRs) link the client's most recent response to a relevant selection of the client's previous statements. Our current chatbot can generate complex reflections-but not BLCRs-using large language models (LLMs) such as GPT-2, which allows the generation of unique, human-like messages customized to client responses. Recent advancements in these models, such as the introduction of GPT-4, provide a novel way to generate complex text by feeding the models instructions and conversational history directly, making this a promising approach to generate BLCRs.</p><p><strong>Objective: </strong>This study aims to develop a method to generate BLCRs for an MI-based smoking cessation chatbot and to measure the method's effectiveness.</p><p><strong>Methods: </strong>LLMs such as GPT-4 can be stimulated to produce specific types of responses to their inputs by \"asking\" them with an English-based description of the desired output. These descriptions are called prompts, and the goal of writing a description that causes an LLM to generate the required output is termed prompt engineering. We evolved an instruction to prompt GPT-4 to generate a BLCR, given the portions of the transcript of the conversation up to the point where the reflection was needed. The approach was tested on 50 previously collected MIBot transcripts of conversations with smokers and was used to generate a total of 150 reflections. The quality of the reflections was rated on a 4-point scale by 3 independent raters to determine whether they met specific criteria for acceptability.</p><p><strong>Results: </strong>Of the 150 generated reflections, 132 (88%) met the level of acceptability. The remaining 18 (12%) had one or more flaws that made them inappropriate as BLCRs. The 3 raters had pairwise agreement on 80% to 88% of these scores.</p><p><strong>Conclusions: </strong>The method presented to generate BLCRs is good enough to be used as one source of reflections in an MI-style conversation but would need an automatic checker to eliminate the unacceptable ones. 
This work illustrates the power of the new LLMs to generate therapeutic client-specific responses under the command of a language-based specification.</p>","PeriodicalId":4,"journal":{"name":"ACS Applied Energy Materials","volume":null,"pages":null},"PeriodicalIF":5.4000,"publicationDate":"2024-09-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11448290/pdf/","citationCount":"0","resultStr":"{\"title\":\"Generation of Backward-Looking Complex Reflections for a Motivational Interviewing-Based Smoking Cessation Chatbot Using GPT-4: Algorithm Development and Validation.\",\"authors\":\"Ash Tanuj Kumar, Cindy Wang, Alec Dong, Jonathan Rose\",\"doi\":\"10.2196/53778\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><strong>Background: </strong>Motivational interviewing (MI) is a therapeutic technique that has been successful in helping smokers reduce smoking but has limited accessibility due to the high cost and low availability of clinicians. To address this, the MIBot project has sought to develop a chatbot that emulates an MI session with a client with the specific goal of moving an ambivalent smoker toward the direction of quitting. One key element of an MI conversation is reflective listening, where a therapist expresses their understanding of what the client has said by uttering a reflection that encourages the client to continue their thought process. Complex reflections link the client's responses to relevant ideas and facts to enhance this contemplation. Backward-looking complex reflections (BLCRs) link the client's most recent response to a relevant selection of the client's previous statements. Our current chatbot can generate complex reflections-but not BLCRs-using large language models (LLMs) such as GPT-2, which allows the generation of unique, human-like messages customized to client responses. Recent advancements in these models, such as the introduction of GPT-4, provide a novel way to generate complex text by feeding the models instructions and conversational history directly, making this a promising approach to generate BLCRs.</p><p><strong>Objective: </strong>This study aims to develop a method to generate BLCRs for an MI-based smoking cessation chatbot and to measure the method's effectiveness.</p><p><strong>Methods: </strong>LLMs such as GPT-4 can be stimulated to produce specific types of responses to their inputs by \\\"asking\\\" them with an English-based description of the desired output. These descriptions are called prompts, and the goal of writing a description that causes an LLM to generate the required output is termed prompt engineering. We evolved an instruction to prompt GPT-4 to generate a BLCR, given the portions of the transcript of the conversation up to the point where the reflection was needed. The approach was tested on 50 previously collected MIBot transcripts of conversations with smokers and was used to generate a total of 150 reflections. The quality of the reflections was rated on a 4-point scale by 3 independent raters to determine whether they met specific criteria for acceptability.</p><p><strong>Results: </strong>Of the 150 generated reflections, 132 (88%) met the level of acceptability. The remaining 18 (12%) had one or more flaws that made them inappropriate as BLCRs. 
The 3 raters had pairwise agreement on 80% to 88% of these scores.</p><p><strong>Conclusions: </strong>The method presented to generate BLCRs is good enough to be used as one source of reflections in an MI-style conversation but would need an automatic checker to eliminate the unacceptable ones. This work illustrates the power of the new LLMs to generate therapeutic client-specific responses under the command of a language-based specification.</p>\",\"PeriodicalId\":4,\"journal\":{\"name\":\"ACS Applied Energy Materials\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":5.4000,\"publicationDate\":\"2024-09-26\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11448290/pdf/\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"ACS Applied Energy Materials\",\"FirstCategoryId\":\"3\",\"ListUrlMain\":\"https://doi.org/10.2196/53778\",\"RegionNum\":3,\"RegionCategory\":\"材料科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"CHEMISTRY, PHYSICAL\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"ACS Applied Energy Materials","FirstCategoryId":"3","ListUrlMain":"https://doi.org/10.2196/53778","RegionNum":3,"RegionCategory":"材料科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"CHEMISTRY, PHYSICAL","Score":null,"Total":0}
Generation of Backward-Looking Complex Reflections for a Motivational Interviewing-Based Smoking Cessation Chatbot Using GPT-4: Algorithm Development and Validation.
Background: Motivational interviewing (MI) is a therapeutic technique that has been successful in helping smokers reduce smoking but has limited accessibility due to the high cost and low availability of clinicians. To address this, the MIBot project has sought to develop a chatbot that emulates an MI session with a client, with the specific goal of moving an ambivalent smoker toward quitting. One key element of an MI conversation is reflective listening, where a therapist expresses their understanding of what the client has said by uttering a reflection that encourages the client to continue their thought process. Complex reflections link the client's responses to relevant ideas and facts to enhance this contemplation. Backward-looking complex reflections (BLCRs) link the client's most recent response to a relevant selection of the client's previous statements. Our current chatbot can generate complex reflections (but not BLCRs) using large language models (LLMs) such as GPT-2, which allow the generation of unique, human-like messages customized to client responses. Recent advancements in these models, such as the introduction of GPT-4, provide a novel way to generate complex text by feeding the models instructions and conversational history directly, making this a promising approach for generating BLCRs.
Objective: This study aims to develop a method to generate BLCRs for an MI-based smoking cessation chatbot and to measure the method's effectiveness.
Methods: LLMs such as GPT-4 can be steered to produce specific types of responses to their inputs by "asking" them with an English-based description of the desired output. These descriptions are called prompts, and the process of writing a description that causes an LLM to generate the required output is termed prompt engineering. We iteratively developed an instruction to prompt GPT-4 to generate a BLCR, given the portion of the conversation transcript up to the point where the reflection was needed. The approach was tested on 50 previously collected MIBot transcripts of conversations with smokers and was used to generate a total of 150 reflections. The quality of the reflections was rated on a 4-point scale by 3 independent raters to determine whether they met specific criteria for acceptability.
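The abstract does not include code, but the prompting setup it describes can be illustrated with a minimal sketch using the OpenAI chat completions API. The instruction wording and the transcript below are invented placeholders, not the study's actual prompt or data; only the overall pattern (an instruction plus the conversation history fed directly to GPT-4) follows the method described.

```python
# Minimal sketch: prompting GPT-4 for a backward-looking complex reflection (BLCR).
# Assumptions: the OpenAI Python client (openai>=1.0) is installed and OPENAI_API_KEY is set.
# INSTRUCTION and the transcript are illustrative placeholders, not the study's prompt or data.
from openai import OpenAI

client = OpenAI()

# Hypothetical instruction describing the desired output (the prompt being engineered).
INSTRUCTION = (
    "You are a motivational interviewing therapist. Read the conversation so far and "
    "reply with a single backward-looking complex reflection: one or two sentences that "
    "link the client's most recent statement to something the client said earlier."
)

# Illustrative transcript up to the point where a reflection is needed.
transcript = [
    {"role": "assistant", "content": "What do you like about smoking?"},
    {"role": "user", "content": "It helps me relax after a stressful shift."},
    {"role": "assistant", "content": "What worries you about it?"},
    {"role": "user", "content": "My kids keep asking me to stop."},
]

response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "system", "content": INSTRUCTION}] + transcript,
    temperature=0.7,
)

print(response.choices[0].message.content)  # the generated BLCR candidate
```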
Results: Of the 150 generated reflections, 132 (88%) met the level of acceptability. The remaining 18 (12%) had one or more flaws that made them inappropriate as BLCRs. The 3 raters had pairwise agreement on 80% to 88% of these scores.
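As a side note on the agreement figure, pairwise agreement between raters can be computed as the fraction of reflections on which two raters give the same acceptability judgment. The sketch below shows that calculation on invented example ratings; it is not the study's data or analysis code.

```python
# Minimal sketch: pairwise percent agreement among three raters on acceptability.
# The ratings below are invented for illustration; they are not the study's data.
from itertools import combinations

# One entry per reflection and rater: True = acceptable, False = not acceptable.
ratings = {
    "rater_a": [True, True, False, True, True],
    "rater_b": [True, True, True, True, False],
    "rater_c": [True, False, False, True, True],
}

for (name1, r1), (name2, r2) in combinations(ratings.items(), 2):
    agreement = sum(a == b for a, b in zip(r1, r2)) / len(r1)
    print(f"{name1} vs {name2}: {agreement:.0%} agreement")
```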
Conclusions: The method presented to generate BLCRs is good enough to be used as one source of reflections in an MI-style conversation but would need an automatic checker to eliminate the unacceptable ones. This work illustrates the power of the new LLMs to generate therapeutic client-specific responses under the command of a language-based specification.