Grain-size distribution unmixing using the R package EMMAgeo

E&G Quaternary Science Journal. Pub Date: 2019-05-16. DOI: 10.5194/EGQSJ-68-29-2019

E. Dietze, M. Dietze


Citations: 69

Abstract

The analysis of grain-size distributions has a long tradition in Quaternary Science and disciplines studying Earth surface and subsurface deposits. The decomposition of multi-modal grain-size distributions into inherent subpopulations, commonly termed end-member modelling analysis (EMMA), is increasingly recognised as a tool to infer the underlying sediment sources, transport and (post-)depositional processes. Most of the existing deterministic EMMA approaches are only able to deliver one out of many possible solutions, thereby shortcutting uncertainty in model parameters. Here, we provide user-friendly computational protocols that support deterministic as well as robust (i.e. explicitly accounting for incomplete knowledge about input parameters in a probabilistic approach) EMMA, in the free and open software framework of R. In addition, and going beyond previous validation tests, we compare the performance of available grain-size EMMA algorithms using four real-world sediment types, covering a wide range of grain-size distribution shapes (alluvial fan, dune, loess and floodplain deposits). These were randomly mixed in the lab to produce a synthetic data set. Across all algorithms, the original data set was modelled with mean R² values of 0.868 to 0.995 and mean absolute deviation (MAD) values of 0.06 % vol to 0.34 % vol. The original grain-size distribution shapes were modelled as end-member loadings with mean R² values of 0.89 to 0.99 and MAD of 0.04 % vol to 0.17 % vol. End-member scores reproduced the original mixing ratios in the synthetic data set with mean R² values of 0.68 to 0.93 and MAD of 0.1 % vol to 1.6 % vol. Depending on the validation criteria, all models provided reliable estimates of the input data, and each of the models exhibits individual strengths and weaknesses. Only robust EMMA allowed uncertainties of the end-members to be objectively estimated and expert knowledge to be included in the end-member definition. Yet, end-member interpretation should carefully consider the geological and sedimentological meaningfulness in terms of sediment sources, transport and deposition as well as post-depositional alteration of grain sizes. EMMA might also be powerful in other geoscientific contexts where the goal is to unmix sources and processes from compositional data sets.
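The validation idea in the abstract (mix known end-members, model the mixture, then compare modelled and original data) can be illustrated with a minimal sketch. It is written in plain Python rather than R and does not use the EMMAgeo API. All names (`mix`, `r_squared`, `mean_abs_dev`), the two end-member distributions, the mixing ratios and the fixed misfit are hypothetical. R² is taken here as the squared Pearson correlation and MAD as the mean absolute difference; both definitions are assumptions and may differ from those used in the paper.

```python
# Hypothetical illustration only: not the EMMAgeo API and not the paper's data.

def mix(scores, loadings):
    """Linear mixing: each sample is a weighted sum of end-member distributions."""
    n_classes = len(loadings[0])
    return [
        [sum(w * em[j] for w, em in zip(row, loadings)) for j in range(n_classes)]
        for row in scores
    ]

def r_squared(x, y):
    """Squared Pearson correlation between two equally long sequences (assumed metric)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)

def mean_abs_dev(x, y):
    """Mean absolute difference between two equally long sequences (assumed metric)."""
    return sum(abs(a - b) for a, b in zip(x, y)) / len(x)

# Two made-up end-member loadings over five grain-size classes (% vol, each sums to 100).
loadings = [
    [50.0, 30.0, 15.0, 4.0, 1.0],   # fine-grained end-member
    [2.0, 8.0, 20.0, 40.0, 30.0],   # coarse-grained end-member
]

# Made-up mixing ratios (scores) for three samples; each row sums to 1.
scores = [[0.8, 0.2], [0.5, 0.5], [0.1, 0.9]]

# "Original" synthetic data set produced by mixing the end-members.
observed = mix(scores, loadings)

# Stand-in for a model output: the same data plus a small fixed misfit
# whose entries sum to zero, so each row still sums to 100 % vol.
misfit = [0.5, -0.5, 0.3, -0.3, 0.0]
modelled = [[v + d for v, d in zip(row, misfit)] for row in observed]

# Sample-wise metrics, averaged over all samples.
mean_r2 = sum(r_squared(o, m) for o, m in zip(observed, modelled)) / len(observed)
mad = sum(mean_abs_dev(o, m) for o, m in zip(observed, modelled)) / len(observed)

print(f"mean R2 = {mean_r2:.3f}, MAD = {mad:.2f} % vol")
```

In an actual validation the `modelled` matrix would come from an unmixing algorithm rather than from a fixed perturbation, and the same two metrics could be applied to loadings and scores as well as to the full data set.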

