Seq2seq modelling for cross-site temporal forecasting of urban air pollutant concentrations leveraging sensor data
Authors: Jiading Zhong, Jianlin Liu
Journal: Building and Environment, volume 269, Article 112463 (published 2025-02-01; impact factor 7.1; JCR Q1, Construction & Building Technology)
DOI: 10.1016/j.buildenv.2024.112463
URL: https://www.sciencedirect.com/science/article/pii/S0360132324013040
Citation count: 0
Abstract
Urban air pollution poses significant health risks and calls for effective monitoring, forecasting, and control strategies. Comprehensive monitoring is often hindered by the limited availability of measurement data. This study introduces a seq2seq model designed to perform operational forecasting of air pollutant concentrations at an unmonitored site using an upwind sensor. The effectiveness of the seq2seq model is systematically evaluated through test cases that explore the effects of several influencing factors: network architecture, embedding method, model complexity, and sensor placement. The test cases involve 252 seq2seq model candidates, which are trained and tested on a synthetic dataset generated with a validated large eddy simulation (LES) model of a typical urban street canyon, ensuring controlled conditions and reproducibility. Scheduled sampling is used during model training to mitigate error accumulation. Results demonstrate that a decoder-only model can only produce flatline predictions, while a well-tuned seq2seq model informed by a strategically placed upwind sensor provides reasonable operational predictions. Despite its simplicity, the linear network with positional embedding, a two-layer structure, and the sensor placed 0.17H above the ground performs best among the seq2seq models. The study also challenges the a priori belief, based on statistical similarity measures, that higher sensor locations are preferable: the sensor at 0.17H enables the best-performing model. These findings underscore the potential of seq2seq models to enhance urban air quality monitoring and offer a scientific basis for informed urban planning and pollution management strategies.
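The abstract names scheduled sampling but does not describe the schedule or the training framework. As a rough illustration of the general idea only, the sketch below decodes a sequence step by step and, at each step, feeds the next input either the ground-truth value (with probability `p_truth`) or the model's own previous prediction. The linear decay schedule, the function names, and the `toy_step` stand-in for a decoder are all assumptions made for this example, not details taken from the paper.

```python
import random
from typing import Callable, List, Optional


def teacher_forcing_prob(epoch: int, total_epochs: int) -> float:
    """Hypothetical schedule: decay the ground-truth probability linearly from 1 to 0."""
    if total_epochs <= 1:
        return 0.0
    return max(0.0, 1.0 - epoch / (total_epochs - 1))


def decode_with_scheduled_sampling(
    step_fn: Callable[[float], float],
    first_input: float,
    targets: List[float],
    p_truth: float,
    rng: Optional[random.Random] = None,
) -> List[float]:
    """Roll a one-step model forward over len(targets) steps.

    After each step, the next input is the ground-truth target with
    probability p_truth, otherwise the model's own prediction.
    """
    rng = rng or random.Random()
    predictions: List[float] = []
    current = first_input
    for target in targets:
        pred = step_fn(current)
        predictions.append(pred)
        # Scheduled sampling: mix ground truth and model output as the next input.
        current = target if rng.random() < p_truth else pred
    return predictions


def toy_step(x: float) -> float:
    """Stand-in for a trained decoder step (assumption for illustration)."""
    return 0.5 * x


if __name__ == "__main__":
    targets = [10.0, 20.0, 30.0]
    for epoch in range(5):
        p = teacher_forcing_prob(epoch, total_epochs=5)
        preds = decode_with_scheduled_sampling(
            toy_step, 8.0, targets, p, rng=random.Random(epoch)
        )
        print(f"epoch={epoch} p_truth={p:.2f} predictions={preds}")
```

Early in training (high `p_truth`) the decoder mostly sees true values; later it increasingly consumes its own outputs, which is the condition it faces at inference time and is why the technique is associated with reduced error accumulation.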
About the journal:
Building and Environment, an international journal, is dedicated to publishing original research papers, comprehensive review articles, editorials, and short communications in the fields of building science, urban physics, and human interaction with the indoor and outdoor built environment. The journal emphasizes innovative technologies and knowledge verified through measurement and analysis. It covers environmental performance across various spatial scales, from cities and communities to buildings and systems, fostering collaborative, multi-disciplinary research with broader significance.