ChatReview: A ChatGPT-enabled natural language processing framework to study domain-specific user reviews

IF 4.9 Machine learning with applications Pub Date : 2023-12-28 DOI:10.1016/j.mlwa.2023.100522

Brittany Ho, Ta’Rhonda Mayberry, Khanh Linh Nguyen, Manohar Dhulipala, Vivek Krishnamani Pallipuram

Machine learning with applications, Volume 15, Article 100522. Publication type: Journal Article. Full text: https://www.sciencedirect.com/science/article/pii/S2666827023000750

Citations: 0

Abstract




Intelligent search engines, including generative pre-trained transformers (GPT), have revolutionized the user search experience. Several fields, including e-commerce, education, and hospitality, are increasingly exploring GPT tools to study user reviews and gain critical insights to improve their service quality. However, massive user-review data and imprecise prompt engineering lead to biased, irrelevant, and impersonal search results. In addition, exposing user data to these search engines may pose privacy issues. Motivated by these factors, we present ChatReview, a ChatGPT-enabled natural language processing (NLP) framework that studies domain-specific user reviews to offer relevant and personalized search results at multiple levels of granularity. The framework accomplishes this task in four phases: data collection, tokenization, query construction, and response generation. The data collection phase gathers domain-specific user reviews from public and private repositories. In the tokenization phase, ChatReview applies sentiment analysis to extract keywords and categorize them into sentiment classes. This process creates a token repository that best describes the user sentiments for a given set of user reviews. In the query construction phase, the framework uses the token repository and domain knowledge to construct three types of ChatGPT prompts: explicit, implicit, and creative. In the response generation phase, ChatReview pipelines these prompts into ChatGPT to generate search results at varying levels of granularity. We analyze our framework using three real-world domains: education, local restaurants, and hospitality. We assert that our framework simplifies prompt engineering for general users to produce effective results while minimizing the exposure of sensitive user data to search engines. We also present a one-of-a-kind Large Language Model (LLM) peer assessment of the ChatReview framework. Specifically, we employ Google’s Bard to objectively and qualitatively analyze the various ChatReview outputs. Our Bard-based analyses yield over 90% satisfaction, establishing ChatReview as a viable survey analysis tool.
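
The tokenization and query construction phases described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not specify the sentiment analyzer or the prompt wording, so the toy lexicon `LEXICON`, the function names `build_token_repository` and `build_prompts`, and all three prompt templates are assumptions made for illustration. The response generation phase (sending the prompts to ChatGPT) is left out.

```python
import re
from collections import Counter

# Hypothetical toy lexicon standing in for a real sentiment analyzer;
# the abstract does not say which analyzer ChatReview uses.
LEXICON = {
    "friendly": "positive",
    "delicious": "positive",
    "clean": "positive",
    "slow": "negative",
    "rude": "negative",
    "overpriced": "negative",
}


def build_token_repository(reviews):
    """Tokenization phase (sketch): count lexicon keywords per sentiment class."""
    repo = {"positive": Counter(), "negative": Counter()}
    for review in reviews:
        for word in re.findall(r"[a-z']+", review.lower()):
            label = LEXICON.get(word)
            if label is not None:
                repo[label][word] += 1
    return repo


def build_prompts(repo, domain, top_n=3):
    """Query construction phase (sketch): one prompt per type, built from
    tokens only, so no raw review text appears in the prompt.
    The three templates are illustrative guesses, not the paper's wording."""
    pos = ", ".join(w for w, _ in repo["positive"].most_common(top_n))
    neg = ", ".join(w for w, _ in repo["negative"].most_common(top_n))
    return {
        "explicit": (
            f"Reviews of a {domain} praise: {pos}; and criticize: {neg}. "
            "List concrete improvements."
        ),
        "implicit": (
            f"What do the keywords {pos}, {neg} suggest about the "
            f"customer experience at a {domain}?"
        ),
        "creative": (
            f"Write a short story about a {domain} described as {pos} "
            f"but also {neg}."
        ),
    }


# Made-up sample reviews, for demonstration only.
reviews = [
    "Friendly staff and delicious food, but service was slow.",
    "Delicious pasta. Overpriced drinks and slow seating.",
]
repo = build_token_repository(reviews)
prompts = build_prompts(repo, "local restaurant")
for kind, text in prompts.items():
    print(f"{kind}: {text}")
```

Building the prompts from the token repository rather than from the reviews themselves is one way to read the abstract's claim about minimizing the exposure of sensitive user data: only aggregated keywords would reach the search engine.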


Source journal

Machine learning with applications (Management Science and Operations Research; Artificial Intelligence; Computer Science Applications)

Self-citation rate: 0.00%

Articles published: (not listed)

Review time: 98 days