Practical considerations on data patterns in Bayesian Maximum Entropy Estimation: A systematic and critical review

Journal of Applied Biosciences. Pub Date: 2023-01-31. DOI: 10.35759/jabs.181.1

Emmanuel Ehnon Gongnet, R. Vihotogbe, Tranquillin Affossogbe Sédjro Antoine, R. G. Glèlè Kakaï



Abstract

Objective: It is well known that some data features (sample size and skewness, among others) can determine a method’s performance. The choice of those features depends on the researcher’s level of awareness of the statistical method. This study analyzed the level of awareness of the influence of key spatial data characteristics (sample size, skewness, spatial dependency and variogram model) in Bayesian Maximum Entropy (BME).

Methodology: A systematic review was conducted covering the period from 1990 (the year BME was introduced) to 2019. Two main keywords, “Bayesian Maximum Entropy” and “BME”, were used for the literature search. Publications that only mentioned the keywords without applying BME were excluded, while those applying BME and/or discussing BME theory were retained. Six of the world’s leading Open Access sources of scientific literature were considered: Science Direct, African Journals Online, Springer, Google Scholar, MDPI and Academic Journals. A total of 118 research articles from 62 journals were identified. Screening of sample sizes shows that 25.4% of the published articles used few samples (fewer than 100), which implies that the variogram might not yield accurate results. The analysis of skewness showed that most researchers neither apply a transformation to skewed data (82.2%) nor consider skewness in their descriptive statistics (90.7%). Even though 11% of theoretical papers mentioned the spatial dependency level, 92.4% of them failed to consider it. Most researchers (68.64%) do not specify the variogram model, but when they do, they mostly use the exponential model (12.7%).

This review clearly shows that most researchers do not consider the effect of sample size, skewness, and spatial dependency level when applying BME, and very few research works have focused on these aspects. This calls for more in-depth studies on the effect of data characteristics on BME’s performance.
Keywords: Bayesian Maximum Entropy, sample size, skewness, spatial dependency
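
The data characteristics named in the abstract can be illustrated with a minimal sketch. It is not taken from the reviewed article. The function names, the sample values, the exponential variogram parameterisation without the factor of 3 (other conventions exist), and the nugget-to-sill thresholds of 25% and 75% for classifying spatial dependency are all assumptions chosen for illustration.

```python
import math


def sample_skewness(values):
    """Moment-based skewness g1 = m3 / m2**1.5 (no small-sample correction)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


def exponential_variogram(h, nugget, partial_sill, range_param):
    """Exponential model: nugget + partial_sill * (1 - exp(-h / range_param)).

    Assumed parameterisation; some texts use exp(-3h / a) instead.
    """
    if h == 0:
        return 0.0  # the semivariance is zero at zero lag by definition
    return nugget + partial_sill * (1.0 - math.exp(-h / range_param))


def spatial_dependency(nugget, partial_sill):
    """Classify dependency from the nugget-to-sill ratio (assumed thresholds)."""
    ratio = nugget / (nugget + partial_sill)
    if ratio < 0.25:
        label = "strong"
    elif ratio <= 0.75:
        label = "moderate"
    else:
        label = "weak"
    return ratio, label


# Hypothetical right-skewed sample; a log transform is one common remedy.
raw = [1.0, 1.0, 1.0, 2.0, 10.0]
logged = [math.log(v) for v in raw]
skew_raw = sample_skewness(raw)
skew_log = sample_skewness(logged)

# Hypothetical fitted variogram parameters.
ratio, label = spatial_dependency(nugget=0.2, partial_sill=0.8)
```

Here the log transform lowers the skewness of the hypothetical sample without removing it, and the assumed variogram parameters fall in the "strong" dependency class. Fitting the variogram itself needs enough point pairs per lag, which is why the abstract flags studies with fewer than 100 samples.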


