Interpreting hourly mass concentrations of PM2.5 chemical components with an optimal deep-learning model
Hongyi Li, Ting Yang, Yiming Du, Yining Tan, Zifa Wang
Journal of Environmental Sciences, volume 151, pages 125-139. Published 2024-03-29. Impact factor 5.9 (JCR Q1, Environmental Sciences).
DOI: 10.1016/j.jes.2024.03.037
https://www.sciencedirect.com/science/article/pii/S1001074224001530
Citations: 0
Abstract
PM2.5 is a complex and diverse mixture that significantly affects the environment, human health, and climate change. However, existing observation and numerical simulation techniques have limitations, such as a lack of data, high acquisition costs, and multiple uncertainties. These limitations hinder both the acquisition of comprehensive information on PM2.5 chemical composition and the effective implementation of refined air pollution prevention and control strategies. In this study, we developed an optimal deep learning model to obtain hourly mass concentrations of key PM2.5 chemical components without complex chemical analysis. The model was trained using a randomly partitioned multivariate dataset arranged in chronological order, including atmospheric state indicators, which previous studies did not consider. Our results showed that the correlation coefficients of key chemical components were no less than 0.96, and the root mean square errors ranged from 0.20 to 2.11 µg/m³ for the entire process (training and testing combined). The model accurately captured the temporal characteristics of key chemical components, outperforming typical machine-learning models, previous studies, and global reanalysis datasets such as the Modern-Era Retrospective analysis for Research and Applications, Version 2 (MERRA-2) and the Copernicus Atmosphere Monitoring Service ReAnalysis (CAMSRA). We also quantified feature importance using a random forest model, which showed that PM2.5, PM1, visibility, and temperature were the most influential variables for key chemical components. In conclusion, this study presents a practical approach to accurately obtaining chemical composition information, which can contribute to filling in missing data, improving air pollution monitoring, and identifying sources. This approach has the potential to enhance air pollution control strategies and promote public health and environmental sustainability.
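The two evaluation metrics quoted in the abstract, the correlation coefficient and the root mean square error, can be illustrated with a minimal sketch. It assumes the correlation coefficient is the Pearson coefficient, which the abstract does not state explicitly. The function names `rmse` and `pearson_r` and the sample values are hypothetical and are not taken from the study.

```python
import math


def rmse(observed, predicted):
    """Root mean square error, in the same units as the inputs (e.g. µg/m³)."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def pearson_r(observed, predicted):
    """Pearson correlation coefficient between observations and predictions."""
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_o == 0 or var_p == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_o * var_p)


# Hypothetical hourly values for one chemical component (µg/m³);
# illustrative only, not data from the study.
observed = [4.0, 6.5, 9.0, 12.5, 8.0]
predicted = [4.5, 6.0, 9.5, 12.0, 8.5]

print(f"r = {pearson_r(observed, predicted):.3f}")
print(f"RMSE = {rmse(observed, predicted):.2f} µg/m³")
```

Reporting both metrics together is useful because a high correlation alone does not rule out a systematic bias, whereas the RMSE expresses the typical error in concentration units.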
About the journal:
The Journal of Environmental Sciences is an international journal founded in 1989. The journal is devoted to publishing original, peer-reviewed research papers on the main aspects of environmental sciences, such as environmental chemistry, environmental biology, ecology, geosciences, and environmental physics. Appropriate subjects include basic and applied research on atmospheric, terrestrial, and aquatic environments; pollution control and abatement technology; conservation of natural resources; and environmental health and toxicology. Announcements of international environmental science meetings and other recent information are also included.