Heterogeneous graphical model for non-negative and non-Gaussian PM2.5 data
Authors: Jiaqi Zhang, Xinyan Fan, Yang Li, Shuangge Ma
Journal: Journal of the Royal Statistical Society Series C-Applied Statistics, vol. 71, no. 5, pp. 1303-1329
Published: 2022-06-22
DOI: 10.1111/rssc.12575
URL: https://onlinelibrary.wiley.com/doi/10.1111/rssc.12575
Impact factor: 1.0; JCR: Q3 (Statistics & Probability); open access: no; citations: 1 (Semantic Scholar)
Studies on the conditional relationships between PM2.5 concentrations among different regions are of great interest for the joint prevention and control of air pollution. Because of seasonal changes in atmospheric conditions, spatial patterns of PM2.5 may differ throughout the year. Additionally, concentration data are both non-negative and non-Gaussian. These data features pose significant challenges to existing methods. This study proposes a heterogeneous graphical model for non-negative and non-Gaussian data via the score matching loss. The proposed method simultaneously clusters multiple datasets and estimates a graph for variables with complex properties in each cluster. Furthermore, our model involves a network that indicates similarity among datasets, and this network can have additional applications. In simulation studies, the proposed method outperforms competing alternatives in both clustering and edge identification. We also analyse the spatial correlations of PM2.5 concentrations across Taiwan's regions using data obtained in 2019 from 67 air-quality monitoring stations. The 12 months are clustered into four groups: January–March, April, May–September and October–December, and the corresponding graphs have 153, 57, 86 and 167 edges respectively. The results show obvious seasonality, which is consistent with the meteorological literature. Geographically, PM2.5 concentrations are more strongly correlated within northern Taiwan and within southern Taiwan, respectively. These results can provide valuable information for developing joint air-quality control strategies.
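To put the reported edge counts in perspective, the short sketch below converts them into graph densities. It assumes, which the abstract does not state explicitly, that each graph is undirected with one node per monitoring station (67 nodes) and no self-loops; the variable names are illustrative only.

```python
from math import comb

# Number of monitoring stations reported in the abstract.
# Assumption: each station is one node of the estimated graph.
N_STATIONS = 67

# Edge counts per cluster of months, as reported in the abstract.
edges = {
    "January-March": 153,
    "April": 57,
    "May-September": 86,
    "October-December": 167,
}

# Number of possible undirected edges without self-loops: C(67, 2).
max_edges = comb(N_STATIONS, 2)

# Fraction of possible edges present in each estimated graph.
density = {name: count / max_edges for name, count in edges.items()}

for name, d in density.items():
    print(f"{name}: {edges[name]} of {max_edges} possible edges ({d:.1%})")
```

Under these assumptions there are 2211 possible edges, so all four graphs are sparse (roughly 2.6% to 7.6% of possible edges), with the October–December and January–March graphs the densest.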
About the journal:
The Journal of the Royal Statistical Society, Series C (Applied Statistics) is a journal of international repute for statisticians both inside and outside the academic world. The journal is concerned with papers which deal with novel solutions to real-life statistical problems by adapting or developing methodology, or by demonstrating the proper application of new or existing statistical methods to them. At their heart, therefore, the papers in the journal are motivated by examples and statistical data of all kinds. The subject-matter covers the whole range of inter-disciplinary fields, e.g. applications in agriculture, genetics, industry, medicine and the physical sciences, and papers on design issues (e.g. in relation to experiments, surveys or observational studies).
A deep understanding of statistical methodology is not necessary to appreciate the content. Although papers describing developments in statistical computing driven by practical examples are within its scope, the journal is not concerned with simply numerical illustrations or simulation studies. The emphasis of Series C is on case-studies of statistical analyses in practice.