Lucas G.S. Félix, Washington Cunha, Claudio M.V. de Andrade, Marcos André Gonçalves, Jussara M. Almeida
Title: Why are you traveling? Inferring trip profiles from online reviews and domain-knowledge
Journal: Online Social Networks and Media, volume 45, Article 100296
Publication date: 2025-01-01
DOI: 10.1016/j.osnem.2024.100296
URL: https://www.sciencedirect.com/science/article/pii/S2468696424000211
Citation count: 0
Abstract
This paper addresses the task of inferring trip profiles (TPs), which consists of determining the profile of travelers engaged in a particular trip given a set of possible categories. TPs may include working trips, leisure journeys with friends, or family vacations. Travelers with different TPs typically have varied plans regarding destinations and timing. TP inference may provide significant insights for numerous tourism-related services, such as geo-recommender systems and tour planning. We focus on TP inference using TripAdvisor, a prominent tourism-centric social media platform, as our data source. Our goal is to evaluate how effectively we can automatically discern the TP from a user review on this platform. A user review encompasses both textual feedback and domain-specific data (such as a user’s previous visits to the location), which are crucial for accurately characterizing the trip. To achieve this, we assess various feature sets (including textual and domain-specific features) and implement advanced machine learning models, such as neural Transformers and open-source Large Language Models (Llama 2, Bloom). We examine two variants of the TP inference task: binary and multi-class. Surprisingly, our findings reveal that combining domain-specific features with a TF-IDF-based representation in an LGBM model performs as well as more complex Transformer and LLM models, while being much more efficient and interpretable.
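
To make the feature combination described in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It builds a TF-IDF vector for each review text and appends domain-specific values to it, using only the Python standard library. The field names (`previous_visits`, `rating`, `label`), the example reviews, and the helper functions are assumptions introduced for illustration. The classifier itself is left out: the resulting rows could be passed to any gradient-boosting model such as LGBM.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep runs of ASCII letters as tokens."""
    return re.findall(r"[a-z]+", text.lower())


def fit_tfidf(docs):
    """Learn a sorted vocabulary and smoothed IDF weights from raw texts."""
    tokenized = [tokenize(d) for d in docs]
    doc_freq = Counter(tok for toks in tokenized for tok in set(toks))
    vocab = sorted(doc_freq)
    n = len(docs)
    # Smoothed IDF: ln((1 + n) / (1 + df)) + 1
    idf = {t: math.log((1 + n) / (1 + doc_freq[t])) + 1.0 for t in vocab}
    return vocab, idf


def tfidf_vector(text, vocab, idf):
    """Return the L2-normalised TF-IDF vector of one text over `vocab`."""
    counts = Counter(tokenize(text))
    raw = [counts[t] * idf[t] for t in vocab]
    norm = math.sqrt(sum(v * v for v in raw))
    return [v / norm for v in raw] if norm > 0 else raw


def build_features(review, vocab, idf):
    """Concatenate text features with (hypothetical) domain features."""
    domain = [float(review["previous_visits"]), float(review["rating"])]
    return tfidf_vector(review["text"], vocab, idf) + domain


# Toy reviews; the fields and labels are illustrative assumptions.
reviews = [
    {"text": "Great hotel for our family vacation, the kids loved the pool",
     "previous_visits": 0, "rating": 5, "label": "family"},
    {"text": "Close to the conference centre, fast wifi for work",
     "previous_visits": 3, "rating": 4, "label": "business"},
]

vocab, idf = fit_tfidf([r["text"] for r in reviews])
X = [build_features(r, vocab, idf) for r in reviews]  # one row per review
y = [r["label"] for r in reviews]                     # trip-profile labels
```

Each row of `X` holds the TF-IDF weights followed by the two domain values, so a tree-based model can split on either kind of feature. That layout is one plausible reason such a model stays interpretable, although the sketch does not reproduce the paper's feature set or results.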