Knowledge-guided machine learning reveals pivotal drivers for gas-to-particle conversion of atmospheric nitrate
Bo Xu, Haofei Yu, Zongbo Shi, Jinxing Liu, Yuting Wei, Zhongcheng Zhang, Yanqi Huangfu, Han Xu, Yue Li, Linlin Zhang, Yinchang Feng, Guoliang Shi
Environmental Science and Ecotechnology, volume 19, Article 100333. Published 2023-10-19. DOI: 10.1016/j.ese.2023.100333
https://www.sciencedirect.com/science/article/pii/S2666498423000984
Citations: 1
Abstract
Particulate nitrate, a key component of fine particles, forms through a complex gas-to-particle conversion process. This process is regulated by the gas-to-particle conversion coefficient of nitrate, ε(NO₃⁻). The relationship between ε(NO₃⁻) and its drivers is highly complex and nonlinear, and can be characterized by machine learning methods. However, conventional machine learning often yields results that lack clear physical meaning and may even contradict established physical and chemical mechanisms because of the influence of ambient factors. An alternative approach is urgently needed, one that offers transparent physical interpretation and deeper insight into the factors that affect ε(NO₃⁻). Here we introduce a supervised machine learning approach: a multilevel nested random forest guided by theory. Our approach robustly identifies NH₄⁺, SO₄²⁻, and temperature as pivotal drivers of ε(NO₃⁻). Notably, the outcomes of traditional random forest analysis differ substantially from the expected results. Furthermore, our approach highlights the significance of NH₄⁺ during both daytime (30%) and nighttime (40%) periods, while appropriately downplaying the influence of some less relevant drivers relative to conventional random forest analysis. This research underscores the transformative potential of integrating domain knowledge with machine learning in atmospheric studies.
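The abstract does not spell out how ε(NO₃⁻) is computed. As a minimal sketch, assuming the definition commonly used in the aerosol literature (the molar fraction of total nitrate found in the particle phase, ε = [NO₃⁻] / ([NO₃⁻] + [HNO₃])), the coefficient could be calculated as follows. The function name `epsilon_nitrate` and the choice of µg/m³ as input units are illustrative assumptions, not taken from the paper.

```python
# Molar masses in g/mol (standard values, rounded).
M_NO3 = 62.005   # particulate nitrate, NO3-
M_HNO3 = 63.013  # gas-phase nitric acid, HNO3


def epsilon_nitrate(no3_ug_m3: float, hno3_ug_m3: float) -> float:
    """Particle-phase fraction of total nitrate, on a molar basis.

    Assumed definition (not stated in the abstract):
        eps = [NO3-] / ([NO3-] + [HNO3])
    Inputs are mass concentrations in ug/m3 (hypothetical unit choice).
    """
    if no3_ug_m3 < 0 or hno3_ug_m3 < 0:
        raise ValueError("concentrations must be non-negative")
    no3_mol = no3_ug_m3 / M_NO3      # convert to umol/m3
    hno3_mol = hno3_ug_m3 / M_HNO3   # convert to umol/m3
    total = no3_mol + hno3_mol
    if total == 0:
        raise ValueError("total nitrate is zero; eps is undefined")
    return no3_mol / total
```

Under this assumed definition, ε(NO₃⁻) ranges from 0 (all nitrate in the gas phase as HNO₃) to 1 (all nitrate in the particle phase), and equimolar amounts of the two species give 0.5.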
About the journal:
Environmental Science & Ecotechnology (ESE) is an international, open-access journal publishing original research in environmental science, engineering, ecotechnology, and related fields. Authors publishing in ESE can immediately, permanently, and freely share their work. They have license options and retain copyright. Published by Elsevier, ESE is co-organized by the Chinese Society for Environmental Sciences, Harbin Institute of Technology, and the Chinese Research Academy of Environmental Sciences, under the supervision of the China Association for Science and Technology.