Urgent issues regarding real-time air quality monitoring data in India: Unveiling solutions and implications for policy and health
Karn Vohra, Madhumitha S., Abhishek Chakraborty, Hitansh Shah, Bharrathi AS., Jayaraju Pakki
Atmospheric Environment: X, volume 25, Article 100308. Published 2025-01-01. DOI: 10.1016/j.aeaoa.2024.100308
https://www.sciencedirect.com/science/article/pii/S2590162124000753
Citation count: 0
Abstract
Deteriorating air quality in India has heightened the emphasis on air quality monitoring, resulting in a 16-fold increase in the number of Continuous Ambient Air Quality Monitoring Sites (CAAQMS) across the country over the last decade. The CAAQMS datasets are used globally, but concerns about their quality have been raised. What is missing is a comprehensive assessment that quantifies the scale of these data issues and their impact on policy- and health-relevant metrics. We therefore develop the first open-source automated tool to identify and address data issues and apply it to six pollutants (PM2.5, PM10, NO, NO2, NOx, and O3) from 213 CAAQMS for 2019–2023. Typical issues in CAAQMS datasets include similar values that repeat continuously for more than 24 h and outliers that occur at almost the same time every day. We also reveal hidden issues for nitrogen oxides (NOx ≈ NO + NO2): (1) NO and NO2 reported in units that do not comply with the Central Pollution Control Board parameter reporting protocol and (2) inconsistent reporting in which either NO or NO2 is recorded as "Not Available" while valid NOx data are reported. The proportion of data affected by consecutively similar observations and outliers has remained fairly consistent, but the number of sites affected by unit inconsistency grew between 2019 and 2023. No significant difference in data quality issues was observed between CAAQMS maintained by central and state pollution control boards, illustrating the country-wide extent of these issues. Removing consecutively similar observations and outliers changes annual mean pollutant concentrations by <5%, whereas correcting the as yet unaddressed unit inconsistency increases annual mean NO2 concentrations by >80% at affected sites.
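One of the typical issues named above, similar values repeating continuously for more than 24 h, can be flagged with a simple run-length scan. The sketch below is not the authors' tool. It assumes hourly readings held in a plain list with `None` for missing hours; the function name `flag_repeated_runs` and the `tolerance` parameter are hypothetical.

```python
from typing import List, Optional, Sequence


def flag_repeated_runs(
    values: Sequence[Optional[float]],
    max_run_hours: int = 24,
    tolerance: float = 0.0,
) -> List[bool]:
    """Mark hourly readings that sit in a run of near-identical values.

    A run is a stretch of consecutive non-missing readings that each differ
    from the first reading of the run by at most `tolerance`. Only runs
    strictly longer than `max_run_hours` are flagged. Missing readings
    (None) end a run and are never flagged.
    """
    n = len(values)
    flags = [False] * n
    start = 0
    while start < n:
        first = values[start]
        if first is None:
            start += 1
            continue
        end = start + 1
        while (
            end < n
            and values[end] is not None
            and abs(values[end] - first) <= tolerance
        ):
            end += 1
        if end - start > max_run_hours:
            for i in range(start, end):
                flags[i] = True
        start = end
    return flags


# 30 identical hours (flagged), 3 varying hours, then exactly 24 identical
# hours (not flagged, because the run does not exceed 24 h).
series = [10.0] * 30 + [12.0, 13.0, 12.5] + [7.0] * 24
flags = flag_repeated_runs(series)
print(sum(flags))
```

Whether a stuck reading should be matched exactly or within a small tolerance depends on the instrument's reporting precision, so `tolerance` is left as a parameter rather than fixed.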
In a separate analysis, we confirm that the unit inconsistency was neither identified nor addressed in multiple peer-reviewed studies of the impact of the COVID-19 lockdown, which is likely to have led those studies to report inaccurate absolute air quality improvements.
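The relation NOx ≈ NO + NO2 suggests one way a unit inconsistency could be detected for a single record. The sketch below is an illustration, not the authors' method. It assumes that NOx is reported in ppb and that NO and NO2 are expected in µg/m3, and it converts between the two using the ideal-gas molar volume at 25 °C and 1 atm; the reference conditions, the 5% tolerance, and the function names are all assumptions.

```python
MOLAR_VOLUME_L = 24.45  # litres per mole of ideal gas at 25 °C and 1 atm
MOLAR_MASS_G = {"NO": 30.01, "NO2": 46.01}  # grams per mole


def ppb_to_ugm3(ppb: float, gas: str) -> float:
    """Convert a mixing ratio in ppb to a mass concentration in µg/m3."""
    return ppb * MOLAR_MASS_G[gas] / MOLAR_VOLUME_L


def ugm3_to_ppb(ugm3: float, gas: str) -> float:
    """Convert a mass concentration in µg/m3 to a mixing ratio in ppb."""
    return ugm3 * MOLAR_VOLUME_L / MOLAR_MASS_G[gas]


def guess_no_no2_unit(
    no: float, no2: float, nox_ppb: float, rel_tol: float = 0.05
) -> str:
    """Guess the unit of paired NO and NO2 readings from NOx in ppb.

    Two hypotheses are tested against NOx ≈ NO + NO2 (all in ppb):
    the readings are already in ppb, or they are in µg/m3 and must be
    converted first. Returns "ppb", "ug/m3", or "undetermined".
    """
    if nox_ppb <= 0:
        return "undetermined"
    sum_if_ppb = no + no2
    sum_if_ugm3 = ugm3_to_ppb(no, "NO") + ugm3_to_ppb(no2, "NO2")
    fits_ppb = abs(sum_if_ppb - nox_ppb) / nox_ppb <= rel_tol
    fits_ugm3 = abs(sum_if_ugm3 - nox_ppb) / nox_ppb <= rel_tol
    if fits_ppb and not fits_ugm3:
        return "ppb"
    if fits_ugm3 and not fits_ppb:
        return "ug/m3"
    return "undetermined"


# Same air sample (10 ppb NO, 20 ppb NO2, 30 ppb NOx) reported two ways.
as_ppb = guess_no_no2_unit(10.0, 20.0, 30.0)
as_ugm3 = guess_no_no2_unit(
    ppb_to_ugm3(10.0, "NO"), ppb_to_ugm3(20.0, "NO2"), 30.0
)
print(as_ppb, as_ugm3)
```

A single record can be ambiguous, so in practice such a check would more plausibly be applied to many records per site and summarised before any correction is made.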
Data cleaning has a substantial impact on air quality-derived metrics for nitrogen oxides and only a marginal impact for other pollutants. After data cleaning, 23 sites in 2019 became non-compliant with the national ambient air quality standards for NO2. As NO2 data quality worsened over the years, the number of sites that became non-compliant after applying our tool rose to 45 in 2023. For PM2.5 and PM10, fewer than 5 sites changed compliance status after data cleaning. Given the marginal changes in PM2.5 and O3 concentrations, premature mortality attributable to exposure to these pollutants in Delhi, Mumbai, and Kolkata changed by <10% after data cleaning. The impact was substantial for NO2, with NO2-related premature deaths increasing by 8–67% across the three megacities. These findings have implications for the global research community and for policy formulation, and they underscore the urgent need for ratified CAAQMS data.
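The compliance changes described above follow from comparing an annual mean against a fixed limit, which a unit correction can push a site across. The sketch below is a toy illustration with made-up readings, not data from the study. It assumes an annual NO2 limit of 40 µg/m3 (passed as a parameter so it can be changed), assumes the affected readings were in ppb, and converts them at 25 °C and 1 atm; the function names are hypothetical.

```python
from statistics import fmean
from typing import Optional, Sequence

# Assumed conversion for NO2 at 25 °C and 1 atm: molar mass / molar volume.
NO2_PPB_TO_UGM3 = 46.01 / 24.45


def annual_mean(values: Sequence[Optional[float]]) -> float:
    """Mean of the non-missing readings (None marks a missing value)."""
    valid = [v for v in values if v is not None]
    if not valid:
        raise ValueError("no valid readings")
    return fmean(valid)


def is_compliant(annual_mean_ugm3: float, limit_ugm3: float = 40.0) -> bool:
    """True if the annual mean does not exceed the assumed annual limit."""
    return annual_mean_ugm3 <= limit_ugm3


# Illustrative readings that were labelled µg/m3 but are assumed to be ppb.
reported = [24.0, None, 26.0, 25.0]
mean_reported = annual_mean(reported)
mean_corrected = mean_reported * NO2_PPB_TO_UGM3

print(is_compliant(mean_reported), is_compliant(mean_corrected))
```

Under these assumptions the conversion multiplies NO2 by roughly 1.88, which is similar in size to the >80% increase reported for affected sites, so a site with an uncorrected mean well below the limit can end up above it.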