Learning from user behavior: A survey-assist algorithm for longitudinal mobility data collection

Travel Behaviour and Society (IF 5.1; Zone 2, Engineering Technology; JCR Q1, Transportation) Pub Date: 2024-03-22 DOI: 10.1016/j.tbs.2024.100761
Hannah Lu, Katie Rischpater, K. Shankari
Full text: https://www.sciencedirect.com/science/article/pii/S2214367X24000243

Abstract

GPS-based travel surveys are widely used in mobility studies to gather crucial qualitative data, such as trip purpose, transportation mode, and replaced mode. However, responding to surveys still burdens users, especially in long-term mobility studies, and leads to response fatigue. We explore a survey-assist strategy that eases this burden through a novel, user-level modeling approach: it leverages each user's past responses to predict responses for new trips, without relying on external data sources such as GIS data.

We investigate three main algorithms for predicting responses: (i) clustering trips and extrapolating responses for similar trips, (ii) using random forest classification, and (iii) clustering that uses a hybrid algorithm to determine spatial structure, which is then fed as input to a classic random forest classifier. The clustering approach can flexibly predict responses for even complex qualitative survey questions; it achieved F-scores of 65%. The random forest pipeline uses an architecture that restricts it to predicting three predetermined survey questions (trip purpose, mode, and replaced mode), but it achieved higher F-scores of 78%.
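To make approach (i) concrete, the following is a minimal sketch of per-user clustering with label extrapolation. It is not the authors' implementation: the fixed-radius greedy clustering on trip start and end points, the 500 m radius, and the names `TripClusterPredictor`, `add_labeled_trip`, and `predict` are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def haversine_m(a, b):
    """Great-circle distance in meters between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371000 * math.asin(math.sqrt(h))


class TripClusterPredictor:
    """Hypothetical per-user model: greedy fixed-radius clustering of trips."""

    def __init__(self, radius_m=500.0):
        self.radius_m = radius_m  # assumed similarity radius, not from the paper
        self.clusters = []        # each: {"start", "end", "labels": Counter}

    def _find(self, start, end):
        # A trip matches a cluster when both endpoints fall within the radius
        # of the endpoints of the trip that founded the cluster.
        for c in self.clusters:
            if (haversine_m(start, c["start"]) <= self.radius_m
                    and haversine_m(end, c["end"]) <= self.radius_m):
                return c
        return None

    def add_labeled_trip(self, start, end, label):
        """Record a user-confirmed response for a past trip."""
        c = self._find(start, end)
        if c is None:
            c = {"start": start, "end": end, "labels": Counter()}
            self.clusters.append(c)
        c["labels"][label] += 1

    def predict(self, start, end):
        """Return (label, confidence), or (None, 0.0) if no cluster matches."""
        c = self._find(start, end)
        if c is None:
            return None, 0.0
        label, count = c["labels"].most_common(1)[0]
        return label, count / sum(c["labels"].values())


# Usage with made-up coordinates and labels.
home, work, gym = (37.8716, -122.2727), (37.7749, -122.4194), (37.8044, -122.2712)
model = TripClusterPredictor(radius_m=500)
for _ in range(3):
    model.add_labeled_trip(home, work, "work")
model.add_labeled_trip(home, work, "shopping")
model.add_labeled_trip(home, gym, "exercise")
print(model.predict((37.8718, -122.2725), (37.7750, -122.4190)))  # ('work', 0.75)
```

Because the label is just a vote over whatever responses the user gave for similar trips, a sketch like this works for any survey question, which matches the flexibility the abstract attributes to the clustering approach. The share of votes for the winning label doubles as a rough confidence value.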

While the survey-assist approach has been implemented by several proprietary systems, to our knowledge, this is the first exploration in the academic literature. It follows that this is also the first rigorous evaluation of multiple algorithms that can implement the approach. The evaluation uses a large-scale, publicly available, longitudinal dataset consisting of approximately 92k trips from 235 users over a period of roughly one and a half years.

With this approach, travel surveys can be pre-filled with the predicted responses for each trip, thus streamlining the survey process for users. When combined with an active learning system that requests user input on low-confidence predictions, the models can be updated and improved over time to better support the long-term collection of longitudinal qualitative data.
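The pre-fill and active-learning loop described above can be sketched as a simple confidence gate. This is an illustrative assumption rather than the paper's design: the function name `plan_survey`, the 0.6 threshold, and the example predictions are hypothetical.

```python
def plan_survey(predictions, threshold=0.6):
    """Split predictions into pre-filled answers and questions to ask the user.

    predictions maps a question name to a (label, confidence) pair.
    The threshold is an assumed tuning parameter, not a value from the paper.
    """
    prefilled, ask_user = {}, []
    for question, (label, confidence) in predictions.items():
        if label is not None and confidence >= threshold:
            prefilled[question] = label   # shown to the user as a default
        else:
            ask_user.append(question)     # low confidence: request input
    return prefilled, ask_user


# Made-up predictions for one new trip.
predictions = {
    "purpose": ("work", 0.92),
    "mode": ("bike", 0.55),
    "replaced_mode": (None, 0.0),
}
prefilled, ask_user = plan_survey(predictions)
print(prefilled)  # {'purpose': 'work'}
print(ask_user)   # ['mode', 'replaced_mode']
```

Answers the user supplies for the `ask_user` questions would then be added to that user's labeled history, so the next round of predictions draws on more data.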

Source journal
CiteScore: 9.80
Self-citation rate: 7.70%
Articles published: 109
About the journal: Travel Behaviour and Society is an interdisciplinary journal publishing high-quality original papers that report leading-edge research in theories, methodologies, and applications concerning transportation issues and challenges that involve the social and spatial dimensions. In particular, it provides a discussion forum for major research in travel behaviour, transportation infrastructure, transportation and environmental issues, mobility and social sustainability, transportation geographic information systems (TGIS), transportation and quality of life, transportation data collection and analysis, etc.