Revealing representative day-types in transport networks using traffic data clustering
Journal of Intelligent Transportation Systems, vol. 28, no. 5, pp. 695-718. Published 2023-08-04.
DOI: 10.1080/15472450.2023.2205020
URL: https://www.sciencedirect.com/org/science/article/pii/S1547245023000841
Abstract
Recognizing spatio-temporal traffic patterns at the network-wide level plays an important role in data-driven intelligent transport systems (ITS) and underpins applications such as short-term prediction and scenario-based traffic management. Common practice in the transport literature is to rely on well-known general-purpose unsupervised machine-learning methods (e.g., k-means, hierarchical, spectral, DBSCAN) and to select the most representative structure and number of day-types based solely on internal evaluation indices. These indices are easy to calculate but limited, since they use only the information in the clustered dataset itself. Ideally, the quality of a clustering should also be demonstrated by external validation criteria: expert assessment or performance in the intended application. The main contribution of this paper is to test and compare the common practice of internal validation against external validation, represented here by short-term prediction, which also serves as a proxy for more general traffic management applications. Compared with external evaluation using short-term prediction, internal evaluation methods tend to underestimate the number of representative day-types needed for the application. The paper also investigates the impact of dimensionality reduction. Using just 0.1% of the original dataset dimensions, very similar clustering and prediction performance can be achieved at up to 20 times lower computational cost, depending on the clustering method. K-means and agglomerative clustering may be the most scalable methods, using up to 60 times fewer computational resources than p-median clustering for very similar prediction performance.
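The contrast between internal and external validation can be illustrated with a small, dependency-free sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the three synthetic day-types and their flow values, the use of the mean silhouette width as the internal index, the farthest-point seeding of k-means, and the simple "match the observed part of the day, predict the rest from the matched centroid" scheme standing in for short-term prediction.

```python
import math

def dist(a, b):
    """Euclidean distance between two equal-length daily profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def centroid(rows):
    """Element-wise mean profile of a non-empty list of days."""
    return [sum(col) / len(rows) for col in zip(*rows)]

def kmeans(days, k, max_iter=100):
    """Plain Lloyd's k-means with deterministic farthest-point seeding."""
    centers = [list(days[0])]
    while len(centers) < k:
        centers.append(list(max(days, key=lambda d: min(dist(d, c) for c in centers))))
    labels = None
    for _ in range(max_iter):
        new = [min(range(k), key=lambda j: dist(d, centers[j])) for d in days]
        if new == labels:          # assignments stable: converged
            break
        labels = new
        for j in range(k):
            members = [d for d, lab in zip(days, labels) if lab == j]
            if members:
                centers[j] = centroid(members)
    return labels, centers

def silhouette(days, labels):
    """Mean silhouette width: an internal index using only the clustered data."""
    k = max(labels) + 1
    total = 0.0
    for i, d in enumerate(days):
        mean_to = {}
        for j in range(k):
            ds = [dist(d, e) for m, e in enumerate(days) if labels[m] == j and m != i]
            if ds:
                mean_to[j] = sum(ds) / len(ds)
        a = mean_to.get(labels[i])                      # cohesion (own cluster)
        b = min(v for j, v in mean_to.items() if j != labels[i])  # separation
        if a is not None:          # singleton clusters contribute 0 by convention
            total += (b - a) / max(a, b)
    return total / len(days)

def prediction_mae(centers, test_days, n_obs):
    """External check: match each test day on its first n_obs intervals, then
    predict the remaining intervals with the matched centroid."""
    errors = []
    for day in test_days:
        best = min(centers, key=lambda c: dist(day[:n_obs], c[:n_obs]))
        errors += [abs(x - y) for x, y in zip(day[n_obs:], best[n_obs:])]
    return sum(errors) / len(errors)

# Hypothetical day-types: flows (veh/h) in four periods of the day.
BASES = {
    "Mon-Thu": [100, 400, 300, 450],
    "Friday": [100, 380, 320, 480],   # close to Mon-Thu, busier afternoon
    "weekend": [60, 150, 250, 200],
}

def shifted(profile, offset):
    """A day of the given type with a uniform demand offset."""
    return [v + offset for v in profile]

train = [shifted(p, o) for p in BASES.values() for o in (-10, 0, 10)]
test = [shifted(BASES["Mon-Thu"], 5), shifted(BASES["Friday"], -5),
        shifted(BASES["weekend"], 5)]

results = {}
for k in (2, 3):
    labels, centers = kmeans(train, k)
    results[k] = (silhouette(train, labels), prediction_mae(centers, test, n_obs=2))
    print(f"k={k}: silhouette={results[k][0]:.3f}, prediction MAE={results[k][1]:.2f}")
```

The toy data are deliberately constructed so that two day-types lie close together. On them, the silhouette is higher for k = 2, which merges the Mon-Thu and Friday days, while the prediction error is lower for k = 3 (5.00 against roughly 6.67 veh/h). This mirrors, on a very small scale, the tendency of internal indices to underestimate the number of day-types that the application needs; it does not reproduce the paper's data, methods, or results.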
About the journal:
The Journal of Intelligent Transportation Systems is devoted to scholarly research on the development, planning, management, operation and evaluation of intelligent transportation systems. Intelligent transportation systems are innovative solutions that address contemporary transportation problems. They are characterized by information, dynamic feedback and automation that allow people and goods to move efficiently. They encompass the full scope of information technologies used in transportation, including control, computation and communication, as well as the algorithms, databases, models and human interfaces. The emergence of these technologies as a pathway for transportation is relatively recent.
The Journal of Intelligent Transportation Systems is especially interested in research that leads to improved planning and operation of the transportation system through the application of new technologies. The journal is particularly interested in research that adds to the scientific understanding of the impacts that intelligent transportation systems can have on accessibility, congestion, pollution, safety, security, noise, and energy and resource consumption.
The journal is interdisciplinary and accepts work from the fields of engineering, economics, planning, policy, business and management, as well as any other disciplines that contribute to the scientific understanding of intelligent transportation systems. The journal is also multi-modal and accepts work on intelligent transportation for all forms of ground, air and water transportation. Example topics include the role of information systems in transportation, traffic flow and control, vehicle control, routing and scheduling, traveler response to dynamic information, planning for ITS innovations, evaluations of ITS field operational tests, ITS deployment experiences, automated highway systems, vehicle control systems, diffusion of ITS, and tools/software for analysis of ITS.