A supervised machine learning model for imputing missing boarding stops in smart card data.

Public Transport | IF 2.3 | Q2, Transportation Science & Technology | Pub Date: 2023-01-01 | Epub Date: 2022-12-07 | DOI: 10.1007/s12469-022-00309-0
Nadav Shalit, Michael Fire, Eran Ben-Elia
Citations: 0

Abstract

Public transport has become an essential part of urban existence with increased population densities and environmental awareness. Large quantities of data are currently generated, allowing for more robust methods to understand travel behavior by harvesting smart card usage. However, public transport datasets suffer from data integrity problems; boarding stop information may be missing due to imperfect acquisition processes or inadequate reporting. This study introduces a supervised machine learning method to impute missing boarding stops based on ordinal classification using GTFS timetable, smart card, and geospatial datasets. A new metric, Pareto Accuracy, is suggested to evaluate algorithms where classes have an ordinal nature. The results are based on a case study in the city of Beer Sheva, Israel, consisting of one month of smart card data. We show that our proposed method is robust to irregular travelers and significantly outperforms well-known imputation methods without the need to mine any additional datasets. Validation on data from another Israeli city using transfer learning shows that the presented model is general and context-free. The implications for transportation planning and travel behavior research are further discussed.
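The abstract names Pareto Accuracy but does not define it. As a rough illustration of why ordinal classes call for a different metric than plain accuracy, the sketch below assumes a tolerance-based score: a prediction counts as correct when it lands within a chosen number of stops of the true boarding stop. The function name `tolerance_accuracy`, the `tolerance` parameter, and the sample stop indices are all hypothetical and are not taken from the paper.

```python
from typing import Sequence


def tolerance_accuracy(
    actual: Sequence[int], predicted: Sequence[int], tolerance: int = 0
) -> float:
    """Share of predictions whose ordinal error is at most `tolerance`.

    Assumption: classes are integer stop positions along a route, so the
    absolute difference between two classes is a meaningful distance.
    With tolerance=0 this reduces to ordinary accuracy.
    """
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if len(actual) == 0:
        raise ValueError("at least one observation is required")
    if tolerance < 0:
        raise ValueError("tolerance must be non-negative")

    hits = sum(1 for a, p in zip(actual, predicted) if abs(a - p) <= tolerance)
    return hits / len(actual)


# Hypothetical stop positions along a single route (not real data).
actual_stops = [3, 7, 1, 12, 5]
predicted_stops = [3, 8, 1, 9, 4]

print(tolerance_accuracy(actual_stops, predicted_stops, tolerance=0))  # 0.4
print(tolerance_accuracy(actual_stops, predicted_stops, tolerance=1))  # 0.8
```

Under this assumed definition, a model that misses by one stop is scored very differently from one that misses by several, which ordinary accuracy cannot distinguish.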

Source journal
Public Transport (Transportation Science & Technology)
CiteScore: 5.40
Self-citation rate: 15.40%
Articles published: 19
Journal description: The scope and purpose of the journal includes, but is not limited to, any type of research in the area of Public Transport: Planning and Operations. At its core, it serves the primary mission of advancing the state of the art and the state of the practice in computer-aided systems and scheduling in public transport. The journal considers any subject in this area, especially with a focus on planning and scheduling. The common ground is the use of computer-aided methods and operations research techniques to improve information management, network and route planning, vehicle and crew scheduling and rostering, vehicle monitoring and management, and practical experience with scheduling and public transport planning methods. Besides theoretical papers, the journal also publishes case studies and applications. Public Transport addresses transport operators, consulting firms and academic institutions involved in development, utilization or research of computer-aided planning and scheduling in public transport. Officially cited as: Public Transp