{"title":"Facilitating topic modeling in tourism research:Comprehensive comparison of new AI technologies","authors":"Andrei P. Kirilenko, Svetlana Stepchenkova","doi":"10.1016/j.tourman.2024.105007","DOIUrl":null,"url":null,"abstract":"<div><p>In the past few years, a new crop of transformer-based language models such as Google's BERT and OpenAI's ChatGPT has become increasingly popular in text analysis, owing their success to their ability to capture the entire document's context. These new methods, however, have yet to percolate into tourism academic literature. This paper aims to fill in this gap by providing a comparative analysis of these instruments against the commonly used Latent Dirichlet Allocation for topic extraction of contrasting tourism-related data: coherent vs. noisy, short vs. long, and small vs. large corpus size. The data are typical of tourism literature and include comments of followers of a popular blogger, TripAdvisor reviews, and review titles. We provide recommendations of data domains where the review methods demonstrate the best performance, consider success dimensions, and discuss each method's strong and weak sides. In general, GPT tends to return comprehensive, highly interpretable, and relevant to the real-world topics for all datasets, including the noisy ones, and at all scales. Meanwhile, ChatGPT is the most vulnerable to the issue of trust common to the “black box” model, which we explore in detail.</p></div>","PeriodicalId":48469,"journal":{"name":"Tourism Management","volume":"106 ","pages":"Article 105007"},"PeriodicalIF":10.9000,"publicationDate":"2024-07-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S0261517724001262/pdfft?md5=9cf4802a7c4ae0637a244cf2391ccfb1&pid=1-s2.0-S0261517724001262-main.pdf","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Tourism Management","FirstCategoryId":"91","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0261517724001262","RegionNum":1,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"ENVIRONMENTAL STUDIES","Score":null,"Total":0}
引用次数: 0
Abstract
In the past few years, a new crop of transformer-based language models such as Google's BERT and OpenAI's ChatGPT has become increasingly popular in text analysis, owing their success to their ability to capture the entire document's context. These new methods, however, have yet to percolate into tourism academic literature. This paper aims to fill in this gap by providing a comparative analysis of these instruments against the commonly used Latent Dirichlet Allocation for topic extraction of contrasting tourism-related data: coherent vs. noisy, short vs. long, and small vs. large corpus size. The data are typical of tourism literature and include comments of followers of a popular blogger, TripAdvisor reviews, and review titles. We provide recommendations of data domains where the review methods demonstrate the best performance, consider success dimensions, and discuss each method's strong and weak sides. In general, GPT tends to return comprehensive, highly interpretable, and relevant to the real-world topics for all datasets, including the noisy ones, and at all scales. Meanwhile, ChatGPT is the most vulnerable to the issue of trust common to the “black box” model, which we explore in detail.
期刊介绍:
Tourism Management, the preeminent scholarly journal, concentrates on the comprehensive management aspects, encompassing planning and policy, within the realm of travel and tourism. Adopting an interdisciplinary perspective, the journal delves into international, national, and regional tourism, addressing various management challenges. Its content mirrors this integrative approach, featuring primary research articles, progress in tourism research, case studies, research notes, discussions on current issues, and book reviews. Emphasizing scholarly rigor, all published papers are expected to contribute to theoretical and/or methodological advancements while offering specific insights relevant to tourism management and policy.