Gaussian Model Trees for Traffic Imputation

International Conference on Pattern Recognition Applications and Methods
Pub Date: 2019-02-19
DOI: 10.5220/0007690502430254

Sebastian Buschjäger, T. Liebig, K. Morik


Citations: 1

Abstract

Traffic congestion is one of the most pressing issues for smart cities. Information on traffic flow can be used to reduce congestion by predicting vehicle counts at unmonitored locations so that countermeasures can be applied before congestion appears. To do so, pricey sensors must be distributed sparsely across the city and on important roads in the city center to collect road and vehicle information throughout the city in real time. Machine Learning models can then be applied to predict vehicle counts at unmonitored locations. To be fault-tolerant and to extend the coverage of the traffic predictions to the suburbs, rural regions, or even neighboring villages, these Machine Learning models should not operate in a central traffic control room but rather be distributed across the city. Gaussian Processes (GPs) work well in the context of traffic count prediction, but cannot capitalize on the vast amount of data available in an entire city. Furthermore, a Gaussian Process is a global and centralized model, which requires all measurements to be available at a central computation node. Product of Expert (PoE) models have been proposed as a scalable alternative to Gaussian Processes. A PoE model trains multiple, independent GPs on different subsets of the data and weights the individual predictions based on each expert's uncertainty. These methods work well, but they assume that experts are independent even though they may share data points. Furthermore, PoE models require exhaustive communication bandwidth between the individual experts to form the final prediction. In this paper we propose a hierarchical Product of Expert model, which consists of multiple layers of small, independent, and local GP experts. We view Gaussian Process induction as a regularized optimization procedure and use this view to derive an efficient algorithm that selects independent regions of the data. We then train local expert models on these regions, so that each expert is responsible for a given region. The resulting algorithm scales well to large amounts of data and outperforms flat PoE models in terms of communication cost, model size, and predictive performance. Last, we discuss how to deploy these local expert models onto small devices.
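The abstract describes a flat PoE model as weighting each expert's prediction by that expert's uncertainty. A minimal sketch of that baseline idea follows, assuming every expert returns a Gaussian prediction (mean and variance) for the same location and that the fusion rule is the standard precision-weighted product of Gaussians. The function name `poe_combine` and the sample numbers are illustrative assumptions. This is not the hierarchical model proposed in the paper.

```python
from typing import Sequence, Tuple


def poe_combine(means: Sequence[float],
                variances: Sequence[float]) -> Tuple[float, float]:
    """Fuse independent Gaussian expert predictions for one location.

    Assumed rule (standard product of Gaussians): the fused precision is
    the sum of the expert precisions, and the fused mean is the
    precision-weighted average of the expert means.
    """
    if len(means) != len(variances) or len(means) == 0:
        raise ValueError("need one variance per mean and at least one expert")
    if any(v <= 0.0 for v in variances):
        raise ValueError("variances must be strictly positive")

    # Precision = 1 / variance: confident experts get larger weights.
    precisions = [1.0 / v for v in variances]
    total_precision = sum(precisions)

    fused_variance = 1.0 / total_precision
    fused_mean = fused_variance * sum(p * m for p, m in zip(precisions, means))
    return fused_mean, fused_variance


# Hypothetical vehicle-count predictions from two experts: the first is
# more certain (variance 4) than the second (variance 16).
mean, variance = poe_combine([100.0, 120.0], [4.0, 16.0])
print(mean, variance)  # roughly 104.0 and 3.2
```

In this sketch the fused mean sits closer to the more confident expert. Forming it requires every expert's mean and variance at one place, which illustrates the communication cost the abstract attributes to flat PoE models.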


