Statistical tools for water quality assessment and monitoring in river ecosystems – a scoping review and recommendations for data analysis
Stefan G. Schreiber, Sanja Schreiber, R. Tanna, D. Roberts, T. Arciszewski
Water Quality Research Journal, published 2022-02-14
DOI: 10.2166/wqrj.2022.028 (https://doi.org/10.2166/wqrj.2022.028)
Citations: 9
Abstract
Robust scientific inference is crucial to evidence-based decision making. Accordingly, selecting appropriate statistical tools and experimental designs is integral to achieving accurate results from data analysis. Environmental monitoring of water quality has become increasingly common and widespread as a result of technological advances, leading to an abundance of datasets. We conducted a scoping review of the water quality literature and found that correlation and linear regression are by far the most frequently used statistical tools. However, the accuracy of inferences drawn from ordinary least squares (OLS) techniques depends on a set of assumptions, most prominently: (a) independence among observations, (b) normally distributed errors, (c) equal variances of errors, and (d) balanced designs. Environmental data, however, often exhibit temporal and spatial dependencies and come from unbalanced designs, which makes OLS techniques unsuitable for valid statistical inference. Generalized least squares (GLS), linear mixed-effect models (LMMs), and generalized linear mixed-effect models (GLMMs), as well as Bayesian data analyses, have been developed to better address these problems. Recent progress in statistical software has made these approaches more accessible and user-friendly. We provide a high-level summary of and practical guidance for these statistical techniques.
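To illustrate assumption (a), the following minimal sketch simulates a monitoring-style time series and checks the OLS residuals for serial dependence. It is an illustration under stated assumptions and is not taken from the paper: the simulated trend, the AR(1) error process with coefficient `rho`, the function names, and the use of the Durbin-Watson statistic as the diagnostic are all choices made here for demonstration. Values of the statistic near 2 are consistent with independent errors, while values well below 2 suggest positive autocorrelation.

```python
import random


def ols_fit(x, y):
    """Closed-form simple linear regression; returns (intercept, slope)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def durbin_watson(residuals):
    """Sum of squared successive differences divided by sum of squares."""
    num = sum(
        (residuals[t] - residuals[t - 1]) ** 2
        for t in range(1, len(residuals))
    )
    den = sum(e ** 2 for e in residuals)
    return num / den


def simulate(n, rho, rng):
    """Hypothetical series: y_t = 2 + 0.5 * t + e_t, with AR(1) errors
    e_t = rho * e_{t-1} + Gaussian noise. rho = 0 gives independent errors."""
    x = list(range(n))
    y = []
    err = 0.0
    for t in x:
        err = rho * err + rng.gauss(0.0, 1.0)
        y.append(2.0 + 0.5 * t + err)
    return x, y


def dw_for(rho, n=500, seed=1):
    """Fit OLS to one simulated series and return its Durbin-Watson value."""
    rng = random.Random(seed)
    x, y = simulate(n, rho, rng)
    intercept, slope = ols_fit(x, y)
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    return durbin_watson(residuals)


dw_independent = dw_for(rho=0.0)     # errors satisfy assumption (a)
dw_autocorrelated = dw_for(rho=0.8)  # errors violate assumption (a)

print(f"Durbin-Watson, independent errors:    {dw_independent:.2f}")
print(f"Durbin-Watson, autocorrelated errors: {dw_autocorrelated:.2f}")
```

A low value for the autocorrelated series signals the kind of temporal dependence for which the abstract recommends GLS or mixed-effect models over plain OLS. In practice an established statistics package would be used for both the diagnostic and the model fitting.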