Daily activity-travel pattern identification using natural language processing and semantic matching

IF 5.7, Zone 2 (Engineering and Technology), Q1 (Economics), Journal of Transport Geography, Pub Date: 2024-11-21, DOI: 10.1016/j.jtrangeo.2024.104057
Suchismita Nayak, Debapratim Pandit
Volume 122, Article 104057. Full text: https://www.sciencedirect.com/science/article/pii/S0966692324002667
Citations: 0

Abstract

Generating daily activity patterns (DAPs) has gained considerable attention because DAPs capture the interdependencies among activities and the underlying behavioural dynamics. Existing clustering methods often aggregate heterogeneous DAPs, which reduces prediction accuracy. This study presents a novel hybrid approach that integrates “direct activity sequence” recognition with a sequence matching algorithm grounded in Natural Language Processing (NLP) techniques. Unlike traditional methods, our approach preserves the distinct identity of each activity sequence, assigning DAPs based on the frequency of distinct sequences within each representative DAP group. This process is further enhanced by a hierarchical activity categorization structure, enabling deeper exploration of in-home activities, household interaction effects, and spatial changes between activity types. Additionally, the introduction of weighted activity categories and a match score calculation system opens new possibilities for future sequence alignment methodologies. Using data from 1808 households (approximately 6500 individuals) in Bidhannagar, India, we demonstrate that our approach outperforms traditional methods in terms of prediction accuracy. This study also explores how evolving patterns of online activities, socio-economic heterogeneity, built-up area and neighbourhood characteristics, and the distinction between weekdays and weekends affect DAP prediction in the context of emerging countries.
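The abstract does not give the match score formula, so the following is only a minimal sketch of how weighted activity categories, a match score, and frequency counts of distinct sequences could fit together. The activity codes, the weights, the position-wise scoring rule, and all function names are assumptions made for illustration; they are not taken from the paper, and the comparison here is slot-by-slot rather than a full sequence alignment.

```python
from collections import Counter

# Hypothetical activity codes and category weights (illustrative, not from the paper):
# H = home, W = work, S = shopping, L = leisure.
WEIGHTS = {"H": 1.0, "W": 3.0, "S": 2.0, "L": 1.5}


def match_score(seq_a, seq_b, weights=WEIGHTS):
    """Weighted slot-by-slot match score in [0, 1] for two equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must cover the same number of time slots")
    # A slot earns its category weight only when both sequences agree on it.
    matched = sum(weights[a] for a, b in zip(seq_a, seq_b) if a == b)
    # Normalise by the heavier activity in each slot so the score stays in [0, 1].
    total = sum(max(weights[a], weights[b]) for a, b in zip(seq_a, seq_b))
    return matched / total if total else 0.0


def assign_pattern(seq, representatives):
    """Return the label of the representative sequence with the highest score."""
    return max(representatives, key=lambda label: match_score(seq, representatives[label]))


def group_frequencies(diaries, representatives):
    """Count each distinct sequence once, then file it under its best-matching group."""
    counts = Counter(diaries)  # frequency of every distinct activity sequence
    groups = {label: Counter() for label in representatives}
    for seq, n in counts.items():
        groups[assign_pattern(seq, representatives)][seq] = n
    return groups


# Toy diaries: one character per time slot.
representatives = {"worker": "HWWH", "shopper": "HSSH"}
diaries = ["HWWH", "HWWH", "HWSH", "HSSH"]
groups = group_frequencies(diaries, representatives)
```

In this toy setup each distinct sequence keeps its own count inside its group instead of being averaged into a cluster centroid, which is the property the abstract emphasises. A real implementation would need the paper's actual category hierarchy and scoring scheme.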
Source journal
CiteScore: 11.50
Self-citation rate: 11.50%
Articles published: 197
About the journal: A major resurgence has occurred in transport geography in the wake of political and policy changes, huge transport infrastructure projects and responses to urban traffic congestion. The Journal of Transport Geography provides a central focus for developments in this rapidly expanding sub-discipline.
Latest articles from this journal
Editorial Board
What determines travel time and distance decay in spatial interaction and accessibility?
Development and application of an optimization model to evaluate future charging demand for long-haul electric vehicles in Ontario, Canada
Coverage vs frequency: Is spatial coverage or temporal frequency more impactful on transit ridership?
Daily activity-travel pattern identification using natural language processing and semantic matching