Analyzing fuzzy semantics of reviews for multi-criteria recommendations

IF 2.7 · Zone 3 (Computer Science) · Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Data & Knowledge Engineering · Pub Date: 2024-05-16 · DOI: 10.1016/j.datak.2024.102314
Navreen Kaur Boparai, Himanshu Aggarwal, Rinkle Rani
Citations: 0

Abstract

Hotel reviews play a vital role in tourism recommender systems. They should be analyzed effectively to improve the accuracy of recommendations, which can be generated either from crisp ratings on a fixed scale or from the real sentiments expressed in reviews. Crisp ratings, however, cannot represent the actual feelings of reviewers. Existing tourism recommender systems mostly recommend hotels on the basis of vague and sparse ratings, resulting in inaccurate recommendations or preferences for online users. This paper presents a semantic approach to analyzing online reviews crawled from tripadvisor.in. It discovers the underlying fuzzy semantics of reviews with respect to multiple hotel criteria rather than relying on crisp ratings. The crawled reviews are preprocessed through data cleaning steps such as stopword and punctuation removal, tokenization, lemmatization, and POS tagging so that their semantics can be analyzed efficiently. Nouns representing frequent hotel features are extracted from the preprocessed reviews and are then used to identify opinion phrases. Fuzzy weights are derived from the normalized frequency of the frequent nouns and combined with the sentiment scores of all synonyms of the adjectives in the identified opinion phrases. The result is a set of fuzzy semantics that forms an ideal representation of reviews for a multi-criteria tourism recommender system. The proposed work is implemented in Python by crawling recent reviews of Jaipur hotels from TripAdvisor and analyzing their semantics. The resultant fuzzy semantics form a manually tagged dataset of reviews, each tagged with the sentiments of its identified aspects. Experimental results show an improved sentiment score when all synonyms of the adjectives are considered. The results are further used to fine-tune BERT models to form encodings for a query-based recommender system. The proposed approach can help tourism and hospitality service providers use such sentiment analysis to examine tourists' negative comments or unpleasant experiences and make appropriate improvements. Moreover, it will help online users get better recommendations while planning their trips.
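
The pipeline described in the abstract (cleaning, frequent-noun extraction, fuzzy weights, synonym-based sentiment) can be illustrated with a minimal standard-library sketch. This is not the authors' implementation. The stopword list, aspect nouns, synonym table, and polarity values below are hypothetical toy stand-ins for real resources such as a POS tagger, a thesaurus, and a sentiment lexicon. The choice to normalize noun frequencies by the largest count, the adjacency rule for opinion phrases, and the multiplication of weight by mean sentiment are assumptions, since the abstract does not give the exact formulas.

```python
import re
from collections import Counter

# Hypothetical toy resources; the abstract does not name the actual lexicons.
STOPWORDS = {"the", "was", "is", "but", "and", "a", "were", "very"}
ASPECT_NOUNS = {"room", "staff", "food", "location"}
# adjective -> its synonyms (itself included); stand-in for a thesaurus lookup
SYNONYMS = {
    "clean": ["clean", "tidy", "spotless"],
    "rude": ["rude", "impolite"],
    "tasty": ["tasty", "delicious"],
}
# word -> polarity in [-1, 1]; stand-in for a sentiment lexicon
POLARITY = {
    "clean": 0.6, "tidy": 0.5, "spotless": 0.8,
    "rude": -0.7, "impolite": -0.6,
    "tasty": 0.7, "delicious": 0.9,
}


def tokenize(text):
    """Lowercase, strip punctuation, and drop stopwords."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]


def fuzzy_weights(reviews):
    """Weight each aspect noun by its frequency divided by the largest frequency
    (an assumed normalization that maps weights into (0, 1])."""
    counts = Counter(t for r in reviews for t in tokenize(r) if t in ASPECT_NOUNS)
    if not counts:
        return {}
    top = max(counts.values())
    return {noun: c / top for noun, c in counts.items()}


def synonym_sentiment(adjective):
    """Mean polarity over every known synonym of the adjective."""
    scores = [POLARITY[s] for s in SYNONYMS.get(adjective, [adjective]) if s in POLARITY]
    return sum(scores) / len(scores) if scores else 0.0


def opinion_adjective(tokens, i):
    """Crude opinion-phrase rule: the adjective right before tokens[i], else right after."""
    for j in (i - 1, i + 1):
        if 0 <= j < len(tokens) and tokens[j] in SYNONYMS:
            return tokens[j]
    return None


def aspect_scores(reviews):
    """Combine each aspect's fuzzy weight with the mean sentiment of its opinion phrases."""
    weights = fuzzy_weights(reviews)
    found = {}
    for review in reviews:
        tokens = tokenize(review)
        for i, tok in enumerate(tokens):
            if tok in ASPECT_NOUNS:
                adj = opinion_adjective(tokens, i)
                if adj is not None:
                    found.setdefault(tok, []).append(synonym_sentiment(adj))
    return {noun: weights[noun] * sum(v) / len(v) for noun, v in found.items()}
```

Calling `aspect_scores` on a list of review strings returns one signed score per aspect noun. A production version would presumably replace the hard-coded tables with a POS tagger and lexicon lookups, and the adjacency rule with proper opinion-phrase extraction.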

Source journal
Data & Knowledge Engineering (Engineering & Technology, Computer Science: Artificial Intelligence)
CiteScore: 5.00
Self-citation rate: 0.00%
Articles published: 66
Review time: 6 months
About the journal: Data & Knowledge Engineering (DKE) stimulates the exchange of ideas and interaction between these two related fields of interest. DKE reaches a worldwide audience of researchers, designers, managers and users. The major aim of the journal is to identify, investigate and analyze the underlying principles in the design and effective use of these systems.
Latest articles from the journal
Reasoning on responsibilities for optimal process alignment computation
SRank: Guiding schema selection in NoSQL document stores
Relating behaviour of data-aware process models
A framework for understanding event abstraction problem solving: Current states of event abstraction studies
A conceptual framework for the government big data ecosystem (‘datagov.eco’)