Multivariate Analysis for Characterization of Air Pollution Sources: Part 1 Prior Data Screening and Underlying Assumptions
Author: Mohammed O. A. Mohammed
Journal: Polish Journal of Environmental Studies
Publication date: 2024-04-18
Publication type: Journal Article
DOI: 10.15244/pjoes/179919 (https://doi.org/10.15244/pjoes/179919)
Citation count: 0
Abstract
Findings obtained from different multivariate methods need to be comparable and consistent, yet these methods rest on different assumptions and differ in their sensitivity to data errors. This study investigates essential aspects of data screening prior to analysis, particularly the detection of outliers, communalities, multicollinearity, and the Kaiser-Meyer-Olkin (KMO) and Bartlett’s tests, and examines how changing test parameters, such as the number of convergence, the number of bootstrap runs, the FPEAK value, and the minimum coefficient of determination (R²), affects model results. Positive matrix factorization (PMF) and Unmix were applied to monitoring data collected at a receptor site. Communality estimates and multicollinearity diagnostics indicated possible data errors in Ca, Cu, Na, and Mn, which affected the stability of source profiles. PMF detected biomass burning, coal combustion
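The screening statistics named in the abstract can be illustrated with a minimal sketch. It is not the author's procedure. It assumes the textbook definitions: the overall KMO measure as the ratio of squared correlations to squared correlations plus squared partial correlations, Bartlett's test of sphericity as χ² = −(n − 1 − (2p + 5)/6)·ln|R| with p(p − 1)/2 degrees of freedom, and the variance inflation factor (VIF) as the diagonal of the inverse correlation matrix. The function name `screening_stats` and the synthetic data are hypothetical and do not come from the study.

```python
import numpy as np


def screening_stats(X):
    """Return (KMO, Bartlett chi-square, degrees of freedom, VIFs).

    X is an (n_samples, n_variables) array. Textbook formulas are
    assumed; this is an illustrative sketch, not the paper's method.
    """
    X = np.asarray(X, dtype=float)
    n, p = X.shape

    # Correlation matrix of the variables and its inverse.
    R = np.corrcoef(X, rowvar=False)
    R_inv = np.linalg.inv(R)

    # Partial correlations follow from the scaled inverse of R.
    d = np.sqrt(np.diag(R_inv))
    partial = -R_inv / np.outer(d, d)

    # Overall KMO uses off-diagonal elements only.
    off = ~np.eye(p, dtype=bool)
    r2 = np.sum(R[off] ** 2)
    q2 = np.sum(partial[off] ** 2)
    kmo = r2 / (r2 + q2)

    # Bartlett's test of sphericity: statistic and degrees of freedom.
    _, logdet = np.linalg.slogdet(R)
    chi2 = -(n - 1 - (2 * p + 5) / 6.0) * logdet
    dof = p * (p - 1) // 2

    # VIF_j = 1 / (1 - R_j^2) equals the j-th diagonal entry of R^-1.
    vif = np.diag(R_inv)
    return kmo, chi2, dof, vif


# Synthetic example only (NOT the study's data): six variables driven
# by two latent factors plus independent noise.
rng = np.random.default_rng(0)
factors = rng.normal(size=(200, 2))
loadings = rng.normal(size=(2, 6))
X = factors @ loadings + 0.5 * rng.normal(size=(200, 6))

kmo, chi2, dof, vif = screening_stats(X)
print(f"KMO = {kmo:.3f}, Bartlett chi2 = {chi2:.1f} (df = {dof})")
print("VIF per variable:", np.round(vif, 2))
```

A large Bartlett statistic relative to its degrees of freedom, together with a KMO value nearer 1 than 0, is conventionally read as support for factor-type analysis, while large VIF values flag variables involved in multicollinearity. Any specific cut-off is a convention rather than something stated in the abstract.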