Predicting and understanding travellers’ mode choices is crucial to developing urban transportation systems and formulating traffic demand management strategies. Machine learning (ML) methods have been widely used as promising alternatives to traditional discrete choice models owing to their high prediction accuracy. However, a significant body of ML methods, especially the branch of neural networks, is constrained by overfitting and a lack of model interpretability. This study employs a neural network with feature selection for predicting travel mode choices and Shapley additive explanations (SHAP) analysis for model interpretation. A dataset collected in Chengdu, China was used for experimentation. The results reveal that the neural network achieves commendable prediction performance, with a 12% improvement over the traditional multinomial logit model. Also, feature selection using a combined result from two embedded methods can alleviate the overfitting tendency of the neural network, while establishing a more robust model against redundant or unnecessary variables. Additionally, the SHAP analysis identifies factors such as travel expenditure, age, driving experience, number of cars owned, individual monthly income, and trip purpose as significant features in our dataset. The heterogeneity of mode choice behaviour is significant among demographic groups, including different age, car ownership, and income levels.
Predicting travel mode choice with a robust neural network and Shapley additive explanations analysis. Li Tang, Chuanli Tang, Qi Fu, Changxi Ma. IET Intelligent Transport Systems, 18(7), pp. 1339-1354, published 2024-04-22. DOI: 10.1049/itr2.12514 (https://doi.org/10.1049/itr2.12514).
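As a rough illustration of the idea behind the SHAP analysis mentioned in the abstract above, the sketch below computes exact Shapley values for a tiny hand-written scoring function. Everything in it is an assumption made for illustration: the function `toy_car_score`, its coefficients, the three features, and the single-point baseline are hypothetical and are not taken from the paper, whose model is a neural network trained on the Chengdu dataset. Practical SHAP tooling typically approximates these values against a background sample rather than enumerating every feature subset.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley attributions of predict(x) relative to predict(baseline)."""
    n = len(x)

    def value(subset):
        # Features in `subset` take their observed value; the rest stay at baseline.
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return predict(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Weight of a coalition of this size in the Shapley average.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                # Marginal contribution of feature i when added to coalition s.
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def toy_car_score(z):
    # Hypothetical score for choosing "car"; coefficients are made up.
    cost, age, cars_owned = z
    return -0.5 * cost + 0.02 * age + 1.0 * cars_owned + 0.1 * cost * cars_owned


x = [10.0, 40.0, 2.0]         # traveller being explained (hypothetical)
baseline = [5.0, 35.0, 1.0]   # reference traveller (hypothetical)
phi = shapley_values(toy_car_score, x, baseline)

for name, contribution in zip(["travel cost", "age", "cars owned"], phi):
    print(f"{name}: {contribution:+.3f}")
```

The contributions sum to the difference between the score at `x` and the score at `baseline`, which is the additivity property that makes such attributions convenient for ranking features. Exact enumeration grows exponentially with the number of features, so it is only practical for toy cases like this one.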
Recent years have witnessed a proliferation of traffic accidents, which has prompted extensive research on automated vehicle (AV) technologies to reduce vehicle accidents, especially on risk assessment frameworks for AV technologies. However, existing time-based frameworks cannot handle complex traffic scenarios and ignore the influence of each moving object's motion tendency on the risk distribution, leading to performance degradation. To address this problem, a comprehensive driving risk management framework named RCP-RF is proposed based on potential field theory in a connected and automated vehicle environment, where the pedestrian risk metric is incorporated into a unified road-vehicle driving risk management framework. Unlike existing algorithms, the proposed framework explicitly accounts for the motion tendency between the ego and obstacle cars and for the pedestrian factor, which can improve the performance of the driving risk model. Moreover, it requires only