Spatiotemporal Analysis of Traffic Data: Correspondence Analysis with Fuzzified Variables vs. Principal Component Analysis Using Weather and Gas Price as Extra Data

Pierre Loslever
Networks and Spatial Economics. Journal Article, published 2024-05-03. DOI: 10.1007/s11067-024-09624-4 (https://doi.org/10.1007/s11067-024-09624-4)

Abstract

Studying large rail traffic databases presents formidable challenges for transport system specialists, particularly when space and time factors must be kept together while also showing influencing factors related to the users and the transport network environment. A bibliographic analysis in both statistics and transport indicated that geometrical methods for feature extraction and dimension reduction are suitable for such a study. Since several methods and options exist, each with, in principle, its own required input data, this article compares Principal Component Analysis (PCA) and Correspondence Analysis (CA) for traffic frequency data, both methods being used in practice with such data. The procedure is as follows. First, a grand matrix is built in which the rows correspond to time windows and the columns to all possible origin-destination links. This large frequency matrix is then studied using PCA and CA. Next, the effects of influencing factors are studied, either keeping their quantitative scales with PCA or applying fuzzy segmentation with CA; in both cases the corresponding data are treated as supplementary column points. The procedure is applied to a rail transport network of 10 stations (one corresponding to the airport) with one-hour time windows over 4 months, the available influencing factors being temperature, rain level and gas price. The comparative analysis shows that CA graphical outputs are more complicated than PCA ones but reveal more specific results, e.g. network user behavior related to the airport, whereas PCA mainly contrasts clusters of links with low vs. high frequencies. Fuzzy windowing, performed on actual and simulated data, reduces the information lost when averaging, e.g. over time, and can reveal non-linear relationships. The possibility of displaying new traffic data in real time is also considered.
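The CA branch of the procedure described in the abstract can be illustrated with a minimal sketch. It is not the author's implementation: the data are synthetic, the count of 90 directed links for 10 stations is an assumption, and the triangular three-level membership functions (`fuzzify`) with percentile breakpoints are one common choice of fuzzy windowing, not necessarily the one used in the article. The function names `correspondence_analysis` and `project_supplementary_columns` are hypothetical. The sketch builds a time-window by link frequency matrix, runs CA through an SVD of the standardized residuals, and projects fuzzified temperature as supplementary column points.

```python
import numpy as np


def fuzzify(x, lo, mid, hi):
    """Triangular fuzzy windowing (assumed shape): low/medium/high memberships summing to 1."""
    x = np.asarray(x, dtype=float)
    low = np.clip((mid - x) / (mid - lo), 0.0, 1.0)
    high = np.clip((x - mid) / (hi - mid), 0.0, 1.0)
    return np.column_stack([low, 1.0 - low - high, high])


def correspondence_analysis(N, n_axes=2):
    """CA of a frequency matrix via SVD of the standardized residuals."""
    N = np.asarray(N, dtype=float)
    P = N / N.sum()                      # correspondence matrix
    r = P.sum(axis=1)                    # row masses (time windows)
    c = P.sum(axis=0)                    # column masses (links)
    S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))
    U, sv, Vt = np.linalg.svd(S, full_matrices=False)
    U, sv, Vt = U[:, :n_axes], sv[:n_axes], Vt[:n_axes]
    row_std = U / np.sqrt(r)[:, None]    # row standard coordinates
    row_coords = row_std * sv            # row principal coordinates
    col_coords = (Vt.T / np.sqrt(c)[:, None]) * sv  # column principal coordinates
    return row_std, row_coords, col_coords, sv


def project_supplementary_columns(X_sup, row_std):
    """Place supplementary columns using the CA transition formula:
    column profile (column divided by its sum) times row standard coordinates."""
    X_sup = np.asarray(X_sup, dtype=float)
    profiles = X_sup / X_sup.sum(axis=0)
    return profiles.T @ row_std


# Synthetic stand-in for the grand matrix: rows = one-hour windows, columns = links.
rng = np.random.default_rng(0)
n_windows, n_stations = 240, 10
n_links = n_stations * (n_stations - 1)  # assumption: directed links, no self-links
hours = np.arange(n_windows)
daily = 1.0 + 0.8 * np.sin(2 * np.pi * hours / 24)       # daily traffic cycle
link_level = rng.uniform(2.0, 20.0, size=n_links)         # mean flow per link
counts = rng.poisson(np.outer(daily, link_level))

# Synthetic temperature, fuzzified with breakpoints at the 5th/50th/95th percentiles.
temperature = 12 + 8 * np.sin(2 * np.pi * (hours - 9) / 24) + rng.normal(0, 1.5, n_windows)
lo, mid, hi = np.percentile(temperature, [5, 50, 95])
memberships = fuzzify(temperature, lo, mid, hi)

row_std, row_coords, col_coords, sv = correspondence_analysis(counts)
sup_coords = project_supplementary_columns(memberships, row_std)
print(sup_coords)  # one point per modality: low, medium, high temperature
```

Because each window keeps graded memberships instead of a single class, a non-monotonic relation between temperature and traffic would show up as the low, medium and high points not lying along a straight line in the CA plane. In the PCA branch, by contrast, the quantitative variable would be kept on its original scale.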
