Enriched traffic datasets for the city of Madrid: Integrating data from traffic sensors, the road infrastructure, calendar data and weather data

Data in Brief (Multidisciplinary Sciences, JCR Q3, IF 1.0). Published: 2024-08-29. DOI: 10.1016/j.dib.2024.110878

Abstract

The proliferation of urban areas and the concurrent increase in vehicular mobility have escalated the urgency for advanced traffic management solutions. This data article introduces two traffic datasets from Madrid, collected between June 2022 and February 2024, to address the challenges of traffic management in urban areas. The first dataset provides detailed traffic flow measurements (vehicles per hour) from urban sensors and road networks, enriched with weather data, calendar data and road infrastructure details from OpenStreetMap. This combination allows for an in-depth analysis of urban mobility. Through preprocessing, data quality is ensured by eliminating inconsistent sensor readings. The second dataset is enhanced for advanced predictive modelling. It includes time-based transformations and a tailored preprocessing pipeline that standardizes numeric data, applies one-hot encoding to categorical features, and uses ordinal encoding for specific features. In constructing the datasets, we initially employed the k-means algorithm to cluster data from multiple sensors, thereby highlighting the most representative ones. This clustering can be adapted or modified according to the user's needs, ensuring flexibility for various analyses and applications.
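The preprocessing steps named above (a time-based transformation, standardization of numeric data, one-hot encoding and ordinal encoding) can be illustrated with a minimal sketch. This is not the authors' pipeline: the column names, the sample values, the choice of a cyclical sine/cosine encoding for the hour, and the ordering used for the ordinal feature are all assumptions made for illustration, and only the Python standard library is used.

```python
import math
from statistics import mean, pstdev

# Hypothetical records: column names and values are illustrative only,
# not taken from the published datasets.
rows = [
    {"hour": 0, "temperature": 12.0, "road_type": "primary", "day_type": "weekday"},
    {"hour": 6, "temperature": 18.0, "road_type": "secondary", "day_type": "saturday"},
    {"hour": 12, "temperature": 24.0, "road_type": "primary", "day_type": "holiday"},
]

# Assumed ordering for the ordinal feature (an illustration, not the paper's).
DAY_TYPE_ORDER = ["weekday", "saturday", "holiday"]


def cyclical(hour, period=24):
    """Time-based transformation: map an hour onto the unit circle."""
    angle = 2 * math.pi * hour / period
    return math.sin(angle), math.cos(angle)


def standardize(values):
    """Rescale numeric values to zero mean and unit (population) variance."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma if sigma else 0.0 for v in values]


def one_hot(values):
    """Return the sorted category list and one 0/1 indicator row per value."""
    categories = sorted(set(values))
    return categories, [[1 if v == c else 0 for c in categories] for v in values]


def ordinal(values, order):
    """Replace each category with its position in an explicit ordering."""
    index = {category: i for i, category in enumerate(order)}
    return [index[v] for v in values]


def preprocess(records):
    """Build a feature matrix (list of rows) and the matching column names."""
    temps = standardize([r["temperature"] for r in records])
    road_categories, road = one_hot([r["road_type"] for r in records])
    days = ordinal([r["day_type"] for r in records], DAY_TYPE_ORDER)

    features = []
    for record, temp, road_row, day in zip(records, temps, road, days):
        hour_sin, hour_cos = cyclical(record["hour"])
        features.append([hour_sin, hour_cos, temp, *road_row, day])

    names = (
        ["hour_sin", "hour_cos", "temperature_z"]
        + [f"road_type={c}" for c in road_categories]
        + ["day_type_ord"]
    )
    return names, features


names, features = preprocess(rows)
```

Here `names` lists the output columns and `features` holds one row per input record. In practice the same steps are more commonly expressed with a data-frame library and a machine-learning preprocessing toolkit; the plain-Python form is used only to keep each transformation visible.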

This work underscores the importance of advanced datasets in urban planning and the versatility of these resources across practical applications. The collected data are relevant to traffic prediction, infrastructure planning, studies on the environmental impact of traffic, event planning, and simulation. These datasets provide a solid foundation not only for academic research but also for designing and implementing more effective and sustainable traffic policies. Furthermore, all related datasets, source code, and documentation have been made publicly available, encouraging further research and practical applications in traffic management and urban planning.
