Causal Ecological Model of the Northern Adriatic Based on Data from the EU Project “LTER Northern Adriatic Sea”
(Original title: Kauzalni ekološki model sjevernog Jadrana temeljem podataka EU projekta “LTER Northern Adriatic Sea”)
Author: Želimir Kurtanjek
DOI: https://doi.org/10.15255/kui.2022.033
Published: 2022-11-17 (Journal Article)
Abstract
The aim of this work was to show the potential of applied artificial intelligence methods and structural causal modelling (Structural Causal Model, SCM) for determining, at a scientific level, functional causal dependencies in complex ecological systems. SCM was applied to determine the dependence of chlorophyll concentration on physical and chemical parameters in the northern Adriatic Sea during the period 1965 to 2015. The experimental data result from a long-term and extensive investigation within the EU project “LTER Northern Adriatic Sea” and are freely available under the EU Open Science policy. The data form a “Big Data” base of 108 687 samples and 43 descriptors. The proposed mathematical model is a Bayes network (BN) in the form of a directed acyclic graph (DAG). The model structure was determined by the Hilbert-Schmidt conditional independence test at a significance level of α = 0.05. The SCM shows that the direct causal variables for chlorophyll concentration are temperature, salinity, pH, and the concentrations of nitrogen, phosphorus, and silica. The BN model was adjusted according to d-separation in order to block confounding and contra-causal back-door interference. The functional forms of the causal dependencies were determined as marginal distributions, using Bayes network models with a single interior layer for interpolation. The most important causal effect was that of temperature (−0.07 μg chlorophyll A/°C). The model predicted a reverse positive causality between chlorophyll concentration and dissolved oxygen (0.2 mg DO₂/μg chlorophyll A). A nonparametric comparative analysis of chlorophyll and abiotic parameters was also carried out between the Croatian part and the Slovenian and Italian part of the northern Adriatic Sea. The comparison was based on medians to avoid the pronounced influence of outliers caused by hydrodynamic effects.
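The d-separation adjustment mentioned above can be illustrated with a minimal sketch. The toy DAG below uses the direct causes of chlorophyll named in the abstract plus the chlorophyll to dissolved oxygen link; the extra edge from temperature to dissolved oxygen is an assumption added only to create a back-door path, and the graph is not the structure identified in the paper. The check uses the standard moralised ancestral graph criterion, and all function names are hypothetical.

```python
from itertools import combinations

# Hypothetical toy DAG (parent, child); NOT the network fitted in the paper.
EDGES = [
    ("temperature", "chlorophyll"),
    ("salinity", "chlorophyll"),
    ("pH", "chlorophyll"),
    ("nitrogen", "chlorophyll"),
    ("phosphorus", "chlorophyll"),
    ("silica", "chlorophyll"),
    ("chlorophyll", "dissolved_oxygen"),
    ("temperature", "dissolved_oxygen"),  # assumed edge, for illustration only
]


def parents_of(edges):
    """Map every node to the set of its parents."""
    parents = {}
    for src, dst in edges:
        parents.setdefault(src, set())
        parents.setdefault(dst, set()).add(src)
    return parents


def ancestral_set(parents, nodes):
    """Return the given nodes together with all of their ancestors."""
    seen, stack = set(), list(nodes)
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(parents[node])
    return seen


def d_separated(edges, x, y, given):
    """True if x and y are d-separated by the set `given` (x, y not in given)."""
    parents = parents_of(edges)
    keep = ancestral_set(parents, {x, y} | set(given))
    # Moralise the ancestral subgraph: link each child to its parents,
    # "marry" parents that share a child, and drop edge directions.
    adj = {node: set() for node in keep}
    for child in keep:
        for p in parents[child]:
            adj[child].add(p)
            adj[p].add(child)
        for a, b in combinations(parents[child], 2):
            adj[a].add(b)
            adj[b].add(a)
    # Delete the conditioning nodes and test whether x can still reach y.
    blocked = set(given)
    seen, stack = set(), [x]
    while stack:
        node = stack.pop()
        if node == y:
            return False
        if node in seen or node in blocked:
            continue
        seen.add(node)
        stack.extend(adj[node] - blocked)
    return True


# Conditioning on the collider "chlorophyll" alone opens a path via temperature.
print(d_separated(EDGES, "salinity", "dissolved_oxygen", {"chlorophyll"}))  # False
# Adding temperature to the conditioning set blocks that path as well.
print(d_separated(EDGES, "salinity", "dissolved_oxygen",
                  {"chlorophyll", "temperature"}))  # True
```

In this toy graph, temperature is a common cause of chlorophyll and dissolved oxygen, so an estimate of the chlorophyll effect on oxygen would need temperature in the adjustment set to close the back-door path.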
The median concentration of dissolved oxygen was 5.8 mg O₂/l in the Croatian Adriatic and 5.5 mg O₂/l in the Slovenian and Italian Adriatic, and the median temperatures were T = 14.6 °C and T = 15.1 °C, respectively. There is a significant difference in the abundance of dinoflagellates: 3 cells/l in the Croatian part versus 5 cells/l in the Slovenian and Italian part. The difference is even more pronounced in the number and magnitude of “hot spot” outliers. The difference between the chlorophyll concentrations is not significant (0.65 and 0.90 μg l⁻¹); however, the distributions of outliers differ significantly, with more frequent and larger outliers in the Italian and Slovenian Adriatic. A significant difference was also observed in the SiO₄ distribution, with higher concentrations in the western Adriatic. Random forest (RF) decision tree models were applied to develop predictive models of biological parameters from abiotic data. The RF models were validated by 5-fold cross-validation. The out-of-bag mean relative errors of the models are 6.5 % for chlorophyll, 17.4 % for photopigment, 18.8 % for diatoms, 17.4 % for dinoflagellates, and 12.1 % for coccolithophores. For each predictive model, the five most important predictors, which together account for 95 % of the importance, were determined.
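The validation metric described above, a mean relative error estimated by 5-fold cross-validation, can be sketched as follows. This is only an illustration under stated assumptions: the data are synthetic, the fold assignment is a simple deterministic one, and an ordinary least-squares line stands in for the random forest so that the example needs nothing beyond the Python standard library. All function names are hypothetical.

```python
from statistics import mean


def k_fold_indices(n_samples, k=5):
    """Yield (train, test) index lists; every sample is held out exactly once."""
    for fold in range(k):
        test = [i for i in range(n_samples) if i % k == fold]
        train = [i for i in range(n_samples) if i % k != fold]
        yield train, test


def fit_line(xs, ys):
    """Least-squares fit of y = a + b*x (a stand-in model, not a random forest)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def cv_mean_relative_error(xs, ys, k=5):
    """Mean relative error in percent, computed on the held-out folds only."""
    errors = []
    for train, test in k_fold_indices(len(xs), k):
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        for i in test:
            prediction = a + b * xs[i]
            errors.append(abs(prediction - ys[i]) / abs(ys[i]))
    return 100.0 * mean(errors)


# Synthetic data: one hypothetical abiotic predictor and a strictly positive
# response with a small alternating disturbance. Not data from the project.
xs = [10.0 + 0.25 * i for i in range(40)]
ys = [2.0 - 0.07 * x + (0.02 if i % 2 else -0.02) for i, x in enumerate(xs)]

print(f"5-fold CV mean relative error: {cv_mean_relative_error(xs, ys):.2f} %")
```

The relative error divides by the observed value, so it is only meaningful for responses that stay away from zero, which is why the synthetic response is kept strictly positive.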