A survey on air pollutant PM2.5 prediction using random forest model
S. Babu, B. Thomas
Environmental Health Engineering and Management Journal (journal article), published 2023-05-25
DOI: https://doi.org/10.34172/ehem.2023.18
Citations: 1
Abstract
Background: Fine particulate matter (PM2.5) is one of the most critical contributors to air pollution, and acute or chronic exposure to it causes serious health effects in humans. Accurate forecasting of PM2.5 concentration is essential for air pollution control and for preventing health complications. This paper surveys the available scientific literature on random forest models for PM2.5 prediction.
Methods: The literature was extracted from the ScienceDirect database using a set of specified search criteria. The input features, data lengths, and evaluation parameters used for PM2.5 prediction were analyzed.
Results: The majority of the publications target daily prediction of outdoor PM2.5. Most publications base their PM2.5 prediction on aerosol optical depth (AOD) and boundary layer height (BLH). PM10 and NO2 are the main air pollutants used as inputs for PM2.5 estimation. Most studies used input data covering more than one year, and the effectiveness of the prediction models was unaffected by the length of the study period. The coefficient of determination, R², is the primary evaluation parameter and is used in all publications. The majority of studies reported R² values greater than 0.85, indicating that random forest regression-based PM2.5 prediction models are reasonably dependable and efficient.
Conclusion: The publications use a variety of meteorological and geological features for PM2.5 estimation, depending on the research context and on data accessibility. The findings show that it is hard to single out one optimal model.
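Since R² is the evaluation parameter shared by all surveyed publications, a minimal sketch of how it is computed may help in reading the 0.85 threshold. The function name `r_squared` and the sample values below are illustrative assumptions, not taken from the paper or from any surveyed study; the formula is the standard definition, R² = 1 - SS_res / SS_tot.

```python
from statistics import fmean


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_obs = fmean(observed)
    # Residual sum of squares: error left over after the model's predictions.
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Total sum of squares: spread of the observations around their mean.
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values are constant; R^2 is undefined")
    return 1.0 - ss_res / ss_tot


# Hypothetical daily PM2.5 values in micrograms per cubic metre
# (made up for illustration only; not data from the survey).
observed = [12.0, 35.0, 48.0, 20.0, 66.0, 30.0]
predicted = [15.0, 32.0, 45.0, 24.0, 60.0, 33.0]

score = r_squared(observed, predicted)
print(f"R^2 = {score:.3f}")
```

A value of 1 means the predictions match the observations exactly, and a value of 0 means the model does no better than always predicting the observed mean, so an R² above 0.85 corresponds to a model that explains more than 85% of the variance in the measured concentrations.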