"Centrality-oriented causality. A study of EU agricultural subsidies and digital development in Poland" by K. Daniel and J. Rydlewski. Pub Date: 2019-08-29. DOI: 10.37190/ord200303
Results of convincing causal statistical inference about socio-economic phenomena are especially desirable as a basis for designing socio-economic programs and government interventions. Unfortunately, real socio-economic problems often fail to satisfy the restrictive assumptions of the causal-analysis procedures proposed in the literature. This paper points out empirical challenges and conceptual opportunities in applying data-depth procedures to causal inference on socio-economic phenomena. We show how statistical functional depths can be used to identify the factual and counterfactual distributions commonly employed in causal-inference procedures. On this basis, a modification of Rubin's causality concept is proposed: a centrality-oriented causality concept. The framework is especially useful for causal inference based on official statistics, i.e., on already existing databases. Methodological considerations involving the extremal depth, the modified band depth, the Fraiman-Muniz depth, and the multivariate Wilcoxon rank-sum statistic are illustrated with a study of the impact of EU direct agricultural subsidies on digital development in Poland over the period 2012-2019.
"Probabilistic forecasting of the Arctic sea ice edge with contour modeling" by Hannah M. Director, A. Raftery, and C. Bitz. Pub Date: 2019-08-25. DOI: 10.1214/20-aoas1405
Sea ice, or frozen ocean water, annually freezes and melts in the Arctic. The need for accurate forecasts of where sea ice will be located weeks to months in advance has increased as the amount of sea ice declines due to climate change. Typical sea ice forecasts are made with ensemble models: physics-based deterministic models of sea ice and the surrounding ocean and atmosphere. This paper introduces Mixture Contour Forecasting, a method that post-processes ensemble output and weights it with observed sea ice patterns from recent years. These forecasts are better calibrated than unadjusted dynamic ensemble forecasts and other statistical reference forecasts. To produce them, a novel statistical technique is introduced that directly models the sea ice edge contour, the boundary around the ice-covered region. Most of the computational effort in post-processing can therefore be placed on the sea ice edge contour, which is of particular importance due to its role in maritime planning. Mixture Contour Forecasting and reference methods are evaluated on monthly sea ice forecasts for 2008-2016 at lead times of 0.5-6.5 months using the European Centre for Medium-Range Weather Forecasts ensemble.
"Climate extreme event attribution using multivariate peaks-over-thresholds modeling and counterfactual theory" by A. Kiriliouk and P. Naveau. Pub Date: 2019-08-08. DOI: 10.1214/20-aoas1355
Numerical climate models are complex and combine a large number of physical processes. They are key tools for quantifying the relative contribution of potential anthropogenic causes (e.g., the current increase in greenhouse gases) to high-impact atmospheric variables like heavy rainfall. These so-called climate extreme event attribution problems are particularly challenging in a multivariate context, that is, when the atmospheric variables are measured on a possibly high-dimensional grid. In this paper, we leverage two statistical theories to assess causality in the context of multivariate extreme event attribution. As we consider an event to be extreme when at least one of the components of the vector of interest is large, extreme-value theory justifies, in an asymptotic sense, a multivariate generalized Pareto distribution for modeling joint extremes. Under this class of distributions, we derive and study probabilities of necessary and sufficient causation as defined by the counterfactual theory of Pearl. To strengthen causal evidence, we propose a dimension-reduction strategy based on the optimal linear projection that maximizes such causation probabilities. Our approach is tested on simulated examples and applied to weekly winter maxima of precipitation outputs from the French CNRM model in the recent CMIP6 experiment.
"How to Apply Multiple Imputation in Propensity Score Matching with Partially Observed Confounders: A Simulation Study and Practical Recommendations" by Albee Y. Ling, M. Montez-Rath, Maya B. Mathur, K. Kapphahn, and M. Desai. Pub Date: 2019-04-16. DOI: 10.22237/jmasm/1608552120
Propensity score matching (PSM) has been widely used to mitigate confounding in observational studies, although complications arise when the covariates used to estimate the PS are only partially observed. Multiple imputation (MI) is a potential solution for handling missing covariates in the estimation of the PS. Unfortunately, it is not clear how best to apply MI strategies in the context of PSM. We conducted a simulation study to compare the performance of popular non-MI missing data methods and various MI-based strategies under different missing data mechanisms (MDMs). We found that commonly applied missing data methods resulted in biased and inefficient estimates, and we observed large variation in performance across MI-based strategies. Based on our findings, we recommend 1) deriving the PS after applying MI (referred to as MI-derPassive); 2) for mild MDMs, conducting PSM within each imputed data set and then averaging the treatment effects into one summary finding (INT-within), and for more complex MDMs, averaging the PSs across the imputed data sets before obtaining one treatment effect using PSM (INT-across); 3) a bootstrap-based variance to account for the uncertainty of PS estimation, matching, and imputation; and 4) inclusion of key auxiliary variables in the imputation model.
{"title":"How to Apply Multiple Imputation in Propensity Score Matching with Partially Observed Confounders: A Simulation Study and Practical Recommendations","authors":"Albee Y. Ling, M. Montez-Rath, Maya B. Mathur, K. Kapphahn, M. Desai","doi":"10.22237/jmasm/1608552120","DOIUrl":"https://doi.org/10.22237/jmasm/1608552120","url":null,"abstract":"Propensity score matching (PSM) has been widely used to mitigate confounding in observational studies, although complications arise when the covariates used to estimate the PS are only partially observed. Multiple imputation (MI) is a potential solution for handling missing covariates in the estimation of the PS. Unfortunately, it is not clear how to best apply MI strategies in the context of PSM. We conducted a simulation study to compare the performances of popular non-MI missing data methods and various MI-based strategies under different missing data mechanisms (MDMs). We found that commonly applied missing data methods resulted in biased and inefficient estimates, and we observed large variation in performance across MI-based strategies. Based on our findings, we recommend 1) deriving the PS after applying MI (referred to as MI-derPassive); 2) conducting PSM within each imputed data set followed by averaging the treatment effects to arrive at one summarized finding (INT-within) for mild MDMs and averaging the PSs across multiply imputed datasets before obtaining one treatment effect using PSM (INT-across) for more complex MDMs; 3) a bootstrapped-based variance to account for uncertainty of PS estimation, matching, and imputation; and 4) inclusion of key auxiliary variables in the imputation model.","PeriodicalId":409996,"journal":{"name":"arXiv: Applications","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-04-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133090647","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
"Modelling of crash types at signalized intersections based on random effect model" by Xuesong Wang, Jinghui Yuan, and Xiaohan Yang. Pub Date: 2018-05-16. DOI: 10.11908/j.issn.0253-374x.2016.01.012
Approach-level models were developed to accommodate the diversity of approaches within the same intersection. A random effect term, representing the intersection-specific effect, was incorporated into each crash type model to account for the spatial correlation between different approaches within the same intersection. The model parameters were estimated under the Bayesian framework. Results show that different crash types are correlated with different groups of factors, and each factor has diverse effects across crash types, which underscores the importance of crash type models. Moreover, the significance of the random effect term confirms the existence of spatial correlations among different approaches within the same intersection.
{"title":"Modelling of crash types at signalized intersections based on random effect model","authors":"Xuesong Wang, Jinghui Yuan, Xiaohan Yang","doi":"10.11908/j.issn.0253-374x.2016.01.012","DOIUrl":"https://doi.org/10.11908/j.issn.0253-374x.2016.01.012","url":null,"abstract":"Approach-level models were developed to accommodate the diversity of approaches within the same intersection. A random effect term, which indicates the intersection-specific effect, was incorporated into each crash type model to deal with the spatial correlation between different approaches within the same intersection. The model parameters were estimated under the Bayesian framework. Results show that different crash types are correlated with different groups of factors, and each factor shows diverse effects on different crash types, which indicates the importance of crash type models. Besides, the significance of random effect term confirms the existence of spatial correlations among different approaches within the same intersection.","PeriodicalId":409996,"journal":{"name":"arXiv: Applications","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-05-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123927004","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
"A Discrete View of the Indian Monsoon to Identify Spatial Patterns of Rainfall" by Adway Mitra, A. Apte, R. Govindarajan, V. Vasan, and S. Vadlamani. Pub Date: 2018-05-01. DOI: 10.1093/CLIMSYS/DZY009
We propose a representation of the Indian summer monsoon rainfall in terms of a probabilistic model based on a Markov random field, consisting of discrete state variables representing low and high rainfall at grid scale and daily rainfall patterns across space and time. These discrete states are conditioned on observed daily gridded rainfall data from the period 2000-2007. The model gives us a set of 10 spatial patterns of daily monsoon rainfall over India, which are robust over a range of user-chosen parameters as well as coherent in space and time. Each day in the monsoon season is assigned exactly one of the spatial patterns, which approximates the spatial distribution of rainfall on that day. Such approximations are quite accurate for nearly 95% of the days. Remarkably, these patterns are representative (with similar accuracy) of the monsoon seasons from 1901 to 2000 as well. Finally, we compare the proposed model with alternative approaches to extracting spatial patterns of rainfall, using empirical orthogonal functions as well as clustering algorithms such as K-means and spectral clustering.
{"title":"A Discrete View of the Indian Monsoon to Identify Spatial Patterns of Rainfall","authors":"Adway Mitra, A. Apte, R. Govindarajan, V. Vasan, S. Vadlamani","doi":"10.1093/CLIMSYS/DZY009","DOIUrl":"https://doi.org/10.1093/CLIMSYS/DZY009","url":null,"abstract":"We propose a representation of the Indian summer monsoon rainfall in terms of a probabilistic model based on a Markov Random Field, consisting of discrete state variables representing low and high rainfall at grid-scale and daily rainfall patterns across space and in time. These discrete states are conditioned on observed daily gridded rainfall data from the period 2000-2007. The model gives us a set of 10 spatial patterns of daily monsoon rainfall over India, which are robust over a range of user-chosen parameters as well as coherent in space and time. Each day in the monsoon season is assigned precisely one of the spatial patterns, that approximates the spatial distribution of rainfall on that day. Such approximations are quite accurate for nearly 95% of the days. Remarkably, these patterns are representative (with similar accuracy) of the monsoon seasons from 1901 to 2000 as well. Finally, we compare the proposed model with alternative approaches to extract spatial patterns of rainfall, using empirical orthogonal functions as well as clustering algorithms such as K-means and spectral clustering.","PeriodicalId":409996,"journal":{"name":"arXiv: Applications","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132338314","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
"Principal Component Analysis on the Philippine Health Data" by Marites F. Carillo, F. Largo, and Roel F. Ceballos. Pub Date: 2018-04-16. DOI: 10.2139/ssrn.3339627
This study was conducted to determine the structure of a set of correlated variables and to create a new set of uncorrelated indices representing the underlying components of the Philippine health data. The data used were the 2009 Philippine Health Data made available by the National Statistical Coordination Board (NSCB) in its 2009 publication. The publication contains the health data of 81 provinces of the Philippines across ten health-system-related determinants, which served as the variables in this study. From the ten determinants, it was found that three significant underlying components summarize the Philippine health data. The first component was named "importance of safe water supply and emphasis on child health", while the second and third components were named "importance of Barangay Health Stations and government health workers and emphasis on pregnant women's health" and "emphasis on women's health", respectively. These three components jointly account for 73.01% of the total variance.
"Point process models for quasi-periodic volcanic earthquakes" by A. Ignatieva, A. Bell, and B. Worton. Pub Date: 2018-03-20. DOI: 10.5038/2163-338X.4.2
Long-period (LP) earthquakes are common at active volcanoes, and are ubiquitous at persistently active andesitic and dacitic subduction zone volcanoes. They provide critical information regarding the state of volcanic unrest, and their occurrence rates are key data for eruption forecasting. LPs are commonly quasi-periodic or "anti-clustered", unlike volcano-tectonic (VT) earthquakes, so the existing Poisson point process methods used to model occurrence rates of VT earthquakes are unlikely to be optimal for LP data. We evaluate the performance of candidate formulations for LP data, based on inhomogeneous point process models with four different inter-event time distributions: exponential (IP), Gamma (IG), inverse Gaussian (IIG), and Weibull (IW). We examine how well these models explain the observed data and the quality of retrospective forecasts of eruption time. We use a Bayesian MCMC approach to fit the models. Goodness of fit is assessed using quantile-quantile and Kolmogorov-Smirnov methods, and by benchmarking against results obtained from synthetic datasets. The IG and IIG models were both found to fit the data well, with the IIG model slightly outperforming the IG model. Retrospective forecasting analysis shows that the IG model performs best, the initial preference for the IIG model being driven by catalogue incompleteness late in the sequence. The IG model fits the data significantly better than the IP model, and simulations show it produces better forecasts for highly periodic data. Simulations also show that, under the IG model, forecast precision increases with the degree of periodicity of the earthquake process, so forecasts should be better for LP earthquakes than for VTs. These results provide a new framework for point process modelling of volcanic earthquake time series and for the verification of alternative models.
"'Cultural Additivity' and How the Values and Norms of Confucianism, Buddhism, and Taoism Co-Exist, Interact, and Influence Vietnamese Society: A Bayesian Analysis of Long-Standing Folktales, Using R and Stan" by Q. Vuong, Manh-Tung Ho, Viet-Phuong La, D. Nhue, Bui Quang Khiem, Nghiem Phu Kien Cuong, Thu-Trang Vuong, Manh-Toan Ho, H. Nguyen, Viet-Ha T. Nguyen, Hiep-Hung Pham, and N. Napier. Pub Date: 2018-03-04. DOI: 10.2139/SSRN.3134541
Every year, the Vietnamese people reportedly burn about 50,000 tons of joss paper, taking the form not only of bank notes but of iPhones, cars, clothes, even housekeepers, in the hope of pleasing the dead. The practice is often mistakenly attributed to traditional Buddhist teachings, but in fact it originated in China, of which most Vietnamese are not aware. In other aspects of life, there are many similar examples of Vietnamese readily and comfortably adding new norms, values, and beliefs, even contradictory ones, to their culture. This phenomenon, dubbed "cultural additivity", prompted us to study the co-existence, interaction, and mutual influence of core values and norms of the Three Teachings (Confucianism, Buddhism, and Taoism) as shown through Vietnamese folktales. Applying Bayesian logistic regression, we evaluated whether the key message of a story was dominated by a given religion (dependent variables), as affected by the appearance of values and anti-values pertaining to the Three Teachings in the story (independent variables). Our main findings include evidence of the cultural additivity of Confucian and Taoist values. More specifically, the empirical results show that the interaction, or addition, of Taoist and Confucian values in a folktale helps predict whether its key message is about Confucianism, with β_{VT·VC} = 0.86. There was no such statistical tendency for Buddhism. These results carry a number of important implications. First, they indicate the dominance of Confucianism, because the joint appearance of Confucian and Taoist values in a story leads to a key message dominated by Confucianism. This is evidence of Confucian dominance and against liberal interpretations of the concept of the Common Roots of the Three Religions ("tam giáo đồng nguyên") as religious unification or unicity. Second, the concept of "cultural additivity" may help explain many interesting socio-cultural phenomena, such as the absence of religious intolerance and extremism in Vietnamese society, outrageous cases of sophistry in education, low productivity in creative endeavors like science and technology, and misleading branding strategies in business. We are aware that our results are only preliminary, and more studies, both theoretical and empirical, must be carried out to give a full account of the explanatory reach of "cultural additivity".
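A simplified analogue of the model described: the paper fits a Bayesian logistic regression in R and Stan, whereas the sketch below uses a frequentist logistic fit in Python purely to show the role of the VT x VC interaction term. All data are synthetic placeholders with an interaction of the reported size planted.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# hypothetical recoding of the folktale corpus: one row per story, binary
# indicators for the appearance of Taoist (VT) and Confucian (VC) values,
# outcome = whether the story's key message is Confucian
rng = np.random.default_rng(4)
n_tales = 307  # placeholder corpus size
tales = pd.DataFrame({
    "VT": rng.integers(0, 2, n_tales),
    "VC": rng.integers(0, 2, n_tales),
})
logit_p = -1.0 + 0.5 * tales.VC + 0.86 * tales.VT * tales.VC  # planted effect
tales["confucian_msg"] = rng.random(n_tales) < 1 / (1 + np.exp(-logit_p))

# the VT:VC coefficient plays the role of the reported beta_{VT.VC}
fit = smf.logit("confucian_msg ~ VT + VC + VT:VC",
                data=tales.astype(float)).fit()
print(fit.params["VT:VC"])
```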
"Probability-Scale Residuals in HIV/AIDS Research: Diagnostics and Inference" by B. Shepherd, Qi Liu, Valentine Wanga, and C. Li. Pub Date: 2018-03-01. DOI: 10.1201/9781315120805-11
The probability-scale residual (PSR) is well defined across a wide variety of variable types and models, making it useful for studies of HIV/AIDS. In this manuscript, we highlight some of the properties of the PSR and illustrate its application with HIV data. As a residual, it is useful for model diagnostics; we demonstrate its use with ordered categorical data and semiparametric transformation models. The PSR can also be used to construct tests of residual correlation. In fact, the partial Spearman rank correlation between $X$ and $Y$ adjusting for covariates $Z$ can be constructed as the correlation between PSRs from models of $Y$ on $Z$ and of $X$ on $Z$. The covariance of PSRs is also useful in some settings. We apply these methods to a variety of HIV datasets, including 1) a study examining risk factors for more severe forms of cervical lesions among 145 women living with HIV in Zambia, 2) a study investigating the associations among 21 metabolomic biomarkers in 70 HIV-positive patients in the southeastern United States, and 3) a genome-wide association study investigating the association between single nucleotide polymorphisms and tenofovir clearance among 501 HIV-positive persons participating in a multi-site randomized clinical trial.
{"title":"Probability-Scale Residuals in HIV/AIDS Research: Diagnostics and Inference","authors":"B. Shepherd, Qi Liu, Valentine Wanga, C. Li","doi":"10.1201/9781315120805-11","DOIUrl":"https://doi.org/10.1201/9781315120805-11","url":null,"abstract":"The probability-scale residual (PSR) is well defined across a wide variety of variable types and models, making it useful for studies of HIV/AIDS. In this manuscript, we highlight some of the properties of the PSR and illustrate its application with HIV data. As a residual, it can be useful for model diagnostics; we demonstrate its use with ordered categorical data and semiparametric transformation models. The PSR can also be used to construct tests of residual correlation. In fact, partial Spearman's rank correlation between $X$ and $Y$ while adjusting for covariates $Z$ can be constructed as the correlation between PSRs from models of $Y$ on $Z$ and of $X$ on $Z$. The covariance of PSRs is also useful in some settings. We apply these methods to a variety of HIV datasets including 1) a study examining risk factors for more severe forms of cervical lesions among 145 women living with HIV in Zambia, 2) a study investigating the association between 21 metabolomic biomarkers among 70 HIV-positive patients in the southeastern United States, and 3) a genome wide association study investigating the association between single nucleotide polymorphisms and tenofovir clearance among 501 HIV-positive persons participating in a multi-site randomized clinical trial.","PeriodicalId":409996,"journal":{"name":"arXiv: Applications","volume":"107 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127010994","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}