Ground visibility prediction using tree-based and random-forest machine learning algorithm: Comparative study based on atmospheric pollution and atmospheric boundary layer data

Atmospheric Pollution Research. IF 3.5; Zone 3 (Environmental Science and Ecology); JCR Q2 (Environmental Sciences). Pub Date: 2024-11-01. Epub Date: 2024-07-29. DOI: 10.1016/j.apr.2024.102270

Fuzeng Wang, Ruolan Liu, Hao Yan, Duanyang Liu, Lin Han, Shujie Yuan

Atmospheric Pollution Research, vol. 15, no. 11, Article 102270. Journal Article. https://www.sciencedirect.com/science/article/pii/S1309104224002356

Citations: 0

Abstract

To mitigate the impacts of haze, three visibility simulation schemes were designed using decision tree and random forest algorithms, drawing on atmospheric boundary layer meteorological data, pollutant concentrations, and ground observations. The best-performing approach was then used to investigate how boundary layer data affect the simulations. The random forest algorithm simulated both haze processes better than the decision tree algorithm. In the first haze process, the random forest algorithm gave lower errors than the decision tree algorithm over the same visibility range, most notably in root mean square error (Scheme 3, visibility < 200 m: mean absolute error reduced by 5.9%, root mean square error reduced by 19.1%). Adding atmospheric boundary layer observation data significantly improved the accuracy of the visibility simulations for both fog-haze processes. The two processes responded differently, however. In the first haze process, adding atmospheric boundary layer meteorological data gave the larger improvement (random forest, visibility < 200 m: mean absolute errors of 25.0 m (relative error < 12.5%) and 25.5 m (relative error < 12.8%) in Schemes 2 and 3, respectively). In the second haze process, adding atmospheric boundary layer pollutant concentration data was more effective (random forest, visibility < 200 m: mean absolute errors of 25.6 m (relative error < 12.8%) and 11.1 m (relative error < 5.6%) in Schemes 2 and 3, respectively). How strongly boundary layer meteorological data and pollutant data influence the simulation of each fog process depends on the cause of that process.
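The error measures quoted in the abstract can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' code. The abstract does not say how relative error was computed; the reported bounds are consistent with dividing the mean absolute error by the 200 m upper limit of the visibility range (25.0 / 200 = 12.5%, 11.1 / 200 = 5.55%), and that reading is an assumption here. The function names and the sample visibilities are hypothetical.

```python
import math


def mae(observed, predicted):
    """Mean absolute error, in the units of the inputs (metres here)."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def rmse(observed, predicted):
    """Root mean square error; penalizes large misses more than MAE does."""
    return math.sqrt(
        sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)
    )


def percent_reduction(baseline, improved):
    """Percentage by which `improved` is lower than `baseline`."""
    return 100.0 * (baseline - improved) / baseline


def relative_error_bound(mae_m, upper_limit_m=200.0):
    """ASSUMPTION: relative error (%) = MAE / upper limit of the visibility range."""
    return 100.0 * mae_m / upper_limit_m


# Hypothetical visibilities in metres, made up for illustration only.
observed = [120.0, 80.0, 150.0, 190.0]
tree_pred = [150.0, 60.0, 110.0, 170.0]    # stand-in for decision tree output
forest_pred = [135.0, 70.0, 130.0, 180.0]  # stand-in for random forest output

print("MAE reduction (%): ",
      percent_reduction(mae(observed, tree_pred), mae(observed, forest_pred)))
print("RMSE reduction (%):",
      percent_reduction(rmse(observed, tree_pred), rmse(observed, forest_pred)))

# MAE values reported in the abstract for visibility < 200 m.
for mae_m in (25.0, 25.5, 25.6, 11.1):
    print(f"MAE {mae_m} m -> {relative_error_bound(mae_m):.2f}% of 200 m")
```

Under this reading, each "relative error <" figure in the abstract is the corresponding mean absolute error expressed as a share of 200 m, rounded up to one decimal place.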


Source journal: Atmospheric Pollution Research (Environmental Sciences)

CiteScore: 8.30
Self-citation rate: 6.70%
Articles published: 256
Review time: 36 days

Journal description: Atmospheric Pollution Research (APR) is an international journal designed for the publication of articles on air pollution. Papers should present novel experimental results, theory and modeling of air pollution on local, regional, or global scales. Areas covered are research on inorganic, organic, and persistent organic air pollutants, air quality monitoring, air quality management, atmospheric dispersion and transport, air-surface (soil, water, and vegetation) exchange of pollutants, dry and wet deposition, indoor air quality, exposure assessment, health effects, satellite measurements, natural emissions, atmospheric chemistry, greenhouse gases, and effects on climate change.