Comparing hundreds of machine learning and discrete choice models for travel demand modeling: An empirical benchmark
Shenhao Wang, Baichuan Mo, Yunhan Zheng, Stephane Hess, Jinhua Zhao
Transportation Research Part B: Methodological, Volume 190, Article 103061 (published 2024-09-12)
DOI: 10.1016/j.trb.2024.103061
Abstract
Numerous studies have compared machine learning (ML) and discrete choice models (DCMs) in predicting travel demand. However, these studies often lack generalizability as they compare models deterministically without considering contextual variations. To address this limitation, our study develops an empirical benchmark by designing a tournament model to learn the intrinsic predictive values of ML and DCMs. This novel approach enables us to efficiently summarize a large number of experiments, quantify the randomness in model comparisons, and use formal statistical tests to differentiate between the model and contextual effects. This benchmark study compares two large-scale data sources: a database compiled from a literature review summarizing 136 experiments from 35 studies, and our own experiment data, encompassing a total of 6970 experiments from 105 models and 12 model families, tested repeatedly across three datasets, sample sizes, and choice categories. This benchmark study yields two key findings. First, many ML models, particularly ensemble methods and deep learning, statistically outperform the DCM family and its individual variants (i.e., multinomial, nested, and mixed logit), thus corroborating previous research. However, this study also highlights the crucial role of contextual factors (i.e., data sources, inputs, and choice categories), which can explain models’ predictive performance more effectively than the differences in model types alone. Model performance varies significantly with data sources, improving with larger sample sizes and lower-dimensional alternative sets. After controlling for all the model and contextual factors, significant randomness still remains, implying inherent uncertainty in such model comparisons. Overall, we suggest that future researchers shift more focus from context-specific and deterministic model comparisons towards examining model transferability across contexts and characterizing the inherent uncertainty in ML, thus creating more robust and generalizable next-generation travel demand models.
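The abstract describes the benchmarking idea only at a high level. The sketch below illustrates one way such an experiment grid could be set up in Python; it is not the paper's code. The synthetic mode-choice generator, the choice of scikit-learn models (LogisticRegression as a stand-in for a multinomial logit DCM, random forest and gradient boosting as ML representatives), the grid of sample sizes and choice-set sizes, and the final least-squares regression that separates model effects from contextual effects (a crude proxy for the paper's tournament model and formal statistical tests) are all assumptions made for illustration.

```python
# Illustrative sketch only -- NOT the paper's code, data, or model specification.
# It mimics the general benchmarking idea: repeatedly compare a DCM-style model
# with ML classifiers on a toy mode-choice task, record accuracy per experiment,
# then regress accuracy on model-family and context indicators.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

def make_mode_choice_data(n, n_alternatives):
    """Generate a toy mode-choice dataset with a hypothetical utility structure."""
    X = rng.normal(size=(n, 6))                  # e.g. travel time, cost, income, ...
    beta = rng.normal(size=(6, n_alternatives))  # random taste coefficients
    utility = X @ beta + rng.gumbel(size=(n, n_alternatives))
    y = utility.argmax(axis=1)                   # chosen alternative = max utility
    return X, y

models = {
    "mnl": LogisticRegression(max_iter=1000),    # stand-in for a multinomial logit DCM
    "rf": RandomForestClassifier(n_estimators=200, random_state=0),
    "gbdt": GradientBoostingClassifier(random_state=0),
}

# One row per experiment: (model family, sample size, number of alternatives, accuracy)
records = []
for sample_size in (1000, 5000):
    for n_alt in (3, 6):
        for rep in range(5):                     # repeated splits quantify randomness
            X, y = make_mode_choice_data(sample_size, n_alt)
            X_tr, X_te, y_tr, y_te = train_test_split(
                X, y, test_size=0.3, random_state=rep)
            for name, model in models.items():
                acc = model.fit(X_tr, y_tr).score(X_te, y_te)
                records.append((name, sample_size, n_alt, acc))

# Separate model effects from contextual effects with a simple linear regression
# on accuracy (a crude stand-in for the paper's formal statistical tests).
names = np.array([r[0] for r in records])
design = np.column_stack([
    np.ones(len(records)),
    (names == "rf").astype(float),               # model effect: RF vs. MNL
    (names == "gbdt").astype(float),             # model effect: GBDT vs. MNL
    np.log([r[1] for r in records]),             # contextual effect: sample size
    [r[2] for r in records],                     # contextual effect: choice-set size
])
acc = np.array([r[3] for r in records])
coef, *_ = np.linalg.lstsq(design, acc, rcond=None)
print(dict(zip(["intercept", "rf_vs_mnl", "gbdt_vs_mnl", "log_n", "n_alt"],
               coef.round(3))))
```

In this toy setup, the coefficients on `rf_vs_mnl` and `gbdt_vs_mnl` play the role of model effects, the coefficients on `log_n` and `n_alt` play the role of contextual effects, and the spread of accuracies across the repeated splits corresponds to the residual randomness that the paper argues remains even after controlling for both.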
About the journal:
Transportation Research Part B publishes papers on all methodological aspects of transportation research, particularly those that require mathematical analysis. The general theme of the journal is the development and solution of well-motivated problems concerning the design and/or analysis of transportation systems. Areas covered include: traffic flow; design and analysis of transportation networks; control and scheduling; optimization; queuing theory; logistics; supply chains; development and application of statistical, econometric and mathematical models to address transportation problems; cost models; pricing and/or investment; traveler or shipper behavior; cost-benefit methodologies.