Statistical Analysis of Particulate Matter Data in Doha, Qatar

C. Taylor, A. Yousif, Kassim S. Mwitondi
Published in: Air Pollution XXVI
Publication date: 2018-06-19
DOI: 10.2495/AIR180101 (https://doi.org/10.2495/AIR180101)
Citations: 4

Abstract

Pollution in Doha is measured using passive, active and automatic sampling. In this paper we consider the automatically sampled data, in which various pollutants were collected continuously and analysed every hour. At each station the sample is analysed on-line and in real time, and the data are stored within the analyser or in a separate logger so that they can be downloaded remotely via a modem. The resulting accuracy enables pollution episodes to be analysed in detail and related to traffic flows, meteorology and other variables. Data have been collected hourly over more than 6 years at 3 different locations, with measurements available for various pollutants (for example, ozone, nitrogen oxides, sulphur dioxide, carbon monoxide, THC, methane and particulate matter: PM1.0, PM2.5 and PM10), as well as meteorological data such as humidity, temperature, and wind speed and direction. Despite much care in the data collection process, the resulting data contain long stretches of missing values where the equipment malfunctioned, often as a result of more extreme conditions. Our analysis is twofold. Firstly, we consider ways to "clean" the data by imputing missing values, including values identified as outliers. Secondly, we consider prediction of each particulate (PM1.0, PM2.5 and PM10) 24 hours ahead, using current and previous pollution and meteorological data. For this we use vector autoregressive models, compare them with decision trees, and propose variable selection criteria which explicitly adapt to missing data. Our results show that the regression tree models, with no variable transformations, perform best, and that attempts to impute missing values are hampered by non-random missingness.
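
To make the 24-hours-ahead prediction setup concrete, the following is a minimal sketch of how an hourly series with gaps could be turned into feature/target rows. It is an illustration only, not the authors' procedure: the function name `make_horizon_pairs`, the choice of lags, the synthetic series, and the complete-case rule (dropping any row that touches a missing value) are all assumptions made for this example.

```python
import math

def make_horizon_pairs(series, lags=(0, 1), horizon=24):
    """Build (time index, features, target) rows for predicting `horizon` hours ahead.

    `series` is a list of hourly readings, with float('nan') marking gaps.
    Features are the values at t - lag for each lag; the target is the
    value at t + horizon. Rows that touch a gap are skipped.
    """
    rows = []
    max_lag = max(lags)
    for t in range(max_lag, len(series) - horizon):
        features = [series[t - lag] for lag in lags]
        target = series[t + horizon]
        # Complete-case filtering (an assumption of this sketch):
        # drop the row if any feature or the target is missing.
        if any(math.isnan(v) for v in features) or math.isnan(target):
            continue
        rows.append((t, features, target))
    return rows

# Synthetic hourly series (not real Doha data) with two missing readings.
pm = [float(i) for i in range(30)]
pm[3] = math.nan    # gap that affects the features of two rows
pm[26] = math.nan   # gap that affects the target of one row

pairs = make_horizon_pairs(pm, lags=(0, 1), horizon=24)
for t, features, target in pairs:
    print(t, features, target)
```

In this toy series only two of the five candidate rows survive, although just two readings are missing. Each gap removes every row that uses it as a feature or as a target, which gives some intuition for why long stretches of missing values make variable selection and imputation matter for this kind of forecasting.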