A Family-based Graphical Approach for Testing Hierarchically Ordered Families of Hypotheses
Z. Qiu, Li Yu, Wenge Guo
Pub Date: 2018-12-01. DOI: 10.13140/RG.2.2.23109.29929
In clinical trial applications, the tested hypotheses are often grouped into multiple hierarchically ordered families. To test such structured hypotheses, various gatekeeping strategies have been developed in the literature, such as serial gatekeeping, parallel gatekeeping, and tree-structured gatekeeping strategies. However, these gatekeeping strategies are often either non-intuitive or insufficiently flexible when addressing increasingly complex logical relationships among families of hypotheses. To overcome this issue, we develop a new family-based graphical approach that makes it easy to derive and visualize different gatekeeping strategies. In the proposed approach, a directed and weighted graph represents the generated gatekeeping strategy: each node corresponds to a family of hypotheses, and two simple updating rules govern the critical value of each family and the transition coefficients between families. Theoretically, we show that the proposed graphical approach strongly controls the overall familywise error rate at a pre-specified level. Through several case studies and a real clinical example, we demonstrate the simplicity and flexibility of the proposed approach.
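The two updating rules are the heart of the approach. The abstract does not reproduce the paper's exact family-level rules, so the sketch below illustrates the general mechanism using the standard weighted-graph updates of Bretz et al. (2009), applied here to family nodes rather than individual hypotheses; the function names and the example graph are hypothetical.

```python
# Hypothetical sketch: propagate a passed family's critical value along the
# graph and rewire transition coefficients (standard graphical-update form,
# not necessarily the paper's exact rules).

def update_graph(alpha, G, i):
    """Remove family i after it is passed: redistribute its critical value
    alpha[i] along outgoing edges G[i][j] and rewire the remaining edges."""
    nodes = [j for j in alpha if j != i]
    new_alpha = {j: alpha[j] + alpha[i] * G[i][j] for j in nodes}
    new_G = {}
    for j in nodes:
        new_G[j] = {}
        for k in nodes:
            if j == k:
                new_G[j][k] = 0.0
                continue
            denom = 1.0 - G[j][i] * G[i][j]
            new_G[j][k] = (G[j][k] + G[j][i] * G[i][k]) / denom if denom > 0 else 0.0
    return new_alpha, new_G

# Example: two families in series, overall level 0.025.
alpha = {"F1": 0.025, "F2": 0.0}
G = {"F1": {"F1": 0.0, "F2": 1.0}, "F2": {"F1": 0.0, "F2": 0.0}}
# Once every hypothesis in F1 is rejected at level alpha["F1"]:
alpha, G = update_graph(alpha, G, "F1")
print(alpha)  # {'F2': 0.025}: F2 inherits the full level
```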
{"title":"A Family-based Graphical Approach for Testing Hierarchically Ordered Families of Hypotheses","authors":"Z. Qiu, Li Yu, Wenge Guo","doi":"10.13140/RG.2.2.23109.29929","DOIUrl":"https://doi.org/10.13140/RG.2.2.23109.29929","url":null,"abstract":"In applications of clinical trials, tested hypotheses are often grouped as multiple hierarchically ordered families. To test such structured hypotheses, various gatekeeping strategies have been developed in the literature, such as series gatekeeping, parallel gatekeeping, tree-structured gatekeeping strategies, etc. However, these gatekeeping strategies are often either non-intuitive or less flexible when addressing increasingly complex logical relationships among families of hypotheses. In order to overcome the issue, in this paper, we develop a new family-based graphical approach, which can easily derive and visualize different gatekeeping strategies. In the proposed approach, a directed and weighted graph is used to represent the generated gatekeeping strategy where each node corresponds to a family of hypotheses and two simple updating rules are used for updating the critical value of each family and the transition coefficient between any two families. Theoretically, we show that the proposed graphical approach strongly controls the overall familywise error rate at a pre-specified level. Through some case studies and a real clinical example, we demonstrate simplicity and flexibility of the proposed approach.","PeriodicalId":186390,"journal":{"name":"arXiv: Methodology","volume":"31 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114572865","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
High-dimensional Log-Error-in-Variable Regression with Applications to Microbial Compositional Data Analysis
Pixu Shi, Yuchen Zhou, Anru R. Zhang
Pub Date: 2018-11-28. DOI: 10.1093/BIOMET/ASAB020
In microbiome and genomic studies, regression on compositional data has become a crucial tool for identifying microbial taxa or genes that are associated with clinical phenotypes. To account for variation in sequencing depth, the classic log-contrast model is often used, in which read counts are normalized into compositions. However, zero read counts and randomness in the covariates remain critical issues. In this article, we introduce a surprisingly simple, interpretable, and efficient method for estimating compositional data regression through the lens of a novel high-dimensional log-error-in-variable regression model. The proposed method corrects for possible overdispersion in the sequencing data while avoiding any subjective imputation of zero read counts. We provide theoretical justification with matching upper and lower bounds on the estimation error. We also consider a general log-error-in-variable regression model, with a corresponding estimation method, to accommodate broader situations. The merits of the procedure are illustrated through real data analysis and simulation studies.
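For reference, the classic log-contrast model mentioned above takes the following standard form (notation ours, not the paper's):

```latex
% Classic log-contrast model for compositional covariates:
% compositions x_{ij} = W_{ij} / \sum_k W_{ik} are obtained by normalizing
% read counts W_{ij}; the zero-sum constraint on the coefficients makes the
% model invariant to the arbitrary sequencing depth.
y_i = \sum_{j=1}^{p} \beta_j \log x_{ij} + \varepsilon_i,
\qquad \sum_{j=1}^{p} \beta_j = 0.
```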
{"title":"High-dimensional Log-Error-in-Variable Regression with Applications to Microbial Compositional Data Analysis","authors":"Pixu Shi, Yuchen Zhou, Anru R. Zhang","doi":"10.1093/BIOMET/ASAB020","DOIUrl":"https://doi.org/10.1093/BIOMET/ASAB020","url":null,"abstract":"In microbiome and genomic study, the regression of compositional data has been a crucial tool for identifying microbial taxa or genes that are associated with clinical phenotypes. To account for the variation in sequencing depth, the classic log-contrast model is often used where read counts are normalized into compositions. However, zero read counts and the randomness in covariates remain critical issues. \u0000In this article, we introduce a surprisingly simple, interpretable, and efficient method for the estimation of compositional data regression through the lens of a novel high-dimensional log-error-in-variable regression model. The proposed method provides both corrections on sequencing data with possible overdispersion and simultaneously avoids any subjective imputation of zero read counts. We provide theoretical justifications with matching upper and lower bounds for the estimation error. We also consider a general log-error-in-variable regression model with corresponding estimation method to accommodate broader situations. The merit of the procedure is illustrated through real data analysis and simulation studies.","PeriodicalId":186390,"journal":{"name":"arXiv: Methodology","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128965402","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Spectral Deconfounding via Perturbed Sparse Linear Models
Domagoj Ćevid, Peter Bühlmann, N. Meinshausen
Pub Date: 2018-11-13. DOI: 10.3929/ETHZ-B-000459190
Standard high-dimensional regression methods assume that the underlying coefficient vector is sparse. This might not be true in some cases, in particular in the presence of hidden confounding variables. Such hidden confounding can be represented as a high-dimensional linear model in which the sparse coefficient vector is perturbed. For this model, we develop and investigate a class of methods based on running the Lasso on preprocessed data. The preprocessing step consists of applying certain spectral transformations that change the singular values of the design matrix. We show that, under some assumptions, one can achieve the optimal $\ell_1$-error rate for estimating the underlying sparse coefficient vector. Our theory also covers the Lava estimator (Chernozhukov et al., 2017) for a special model class. The performance of the method is illustrated on simulated data and a genomic dataset.
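A minimal sketch of the "preprocess, then Lasso" recipe, using the trim transform (capping the singular values of the design matrix at their median) as one concrete spectral transformation; the regularization level and simulated data here are illustrative assumptions, not the paper's choices.

```python
import numpy as np
from sklearn.linear_model import Lasso

def trim_transform(X, y):
    """Cap the singular values of X at their median and apply the same
    linear map F to both X and y."""
    U, s, _ = np.linalg.svd(X, full_matrices=False)
    tau = np.median(s)
    d = np.minimum(1.0, tau / s)   # shrinks only singular values above tau
    F = U @ np.diag(d) @ U.T       # spectral transformation
    return F @ X, F @ y

rng = np.random.default_rng(0)
n, p = 100, 200
X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[:5] = 2.0                     # sparse ground truth
y = X @ beta + rng.standard_normal(n)

X_t, y_t = trim_transform(X, y)
fit = Lasso(alpha=0.1).fit(X_t, y_t)   # Lasso on the preprocessed data
```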
{"title":"Spectral Deconfounding via Perturbed Sparse Linear Models","authors":"Domagoj Cevid, Peter Buhlmann, N. Meinshausen","doi":"10.3929/ETHZ-B-000459190","DOIUrl":"https://doi.org/10.3929/ETHZ-B-000459190","url":null,"abstract":"Standard high-dimensional regression methods assume that the underlying coefficient vector is sparse. This might not be true in some cases, in particular in presence of hidden, confounding variables. Such hidden confounding can be represented as a high-dimensional linear model where the sparse coefficient vector is perturbed. For this model, we develop and investigate a class of methods that are based on running the Lasso on preprocessed data. The preprocessing step consists of applying certain spectral transformations that change the singular values of the design matrix. We show that, under some assumptions, one can achieve the optimal $ell_1$-error rate for estimating the underlying sparse coefficient vector. Our theory also covers the Lava estimator (Chernozhukov et al. [2017]) for a special model class. The performance of the method is illustrated on simulated data and a genomic dataset.","PeriodicalId":186390,"journal":{"name":"arXiv: Methodology","volume":"292 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123043105","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Handbook of Mixture Analysis
I. C. Gormley, Sylvia Frühwirth-Schnatter
Pub Date: 2018-11-01. DOI: 10.1201/9780429055911
Mixtures of experts models provide a framework in which covariates may be included in mixture models. This is achieved by modelling the parameters of the mixture model as functions of concomitant covariates. Given their mixture model foundation, mixtures of experts models possess a diverse range of analytic uses, from clustering observations to capturing parameter heterogeneity in cross-sectional data. This chapter focuses on delineating the mixture of experts modelling framework and on demonstrating the utility and flexibility of mixtures of experts models as an analytic tool.
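As a toy illustration of how concomitant covariates enter a mixture model, the sketch below evaluates a two-component Gaussian mixture of experts density whose mixing weight is a logistic function of a covariate w; all parameter values are hypothetical and not from the chapter.

```python
import numpy as np

def moe_density(y, w, gate_coef, means, sds):
    """Two-component Gaussian mixture of experts: the gating probability
    P(component 1 | w) is logistic in the concomitant covariate w."""
    pi1 = 1.0 / (1.0 + np.exp(-(gate_coef[0] + gate_coef[1] * w)))
    def normal_pdf(y, m, s):
        return np.exp(-0.5 * ((y - m) / s) ** 2) / (s * np.sqrt(2 * np.pi))
    return pi1 * normal_pdf(y, means[0], sds[0]) + (1 - pi1) * normal_pdf(y, means[1], sds[1])

# Density of y = 0.5 for a unit with covariate w = 1.2:
print(moe_density(0.5, 1.2, gate_coef=(-0.3, 0.8), means=(0.0, 2.0), sds=(1.0, 0.5)))
```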
{"title":"Handbook of Mixture Analysis","authors":"I. C. Gormley, Sylvia Frühwirth-Schnatter","doi":"10.1201/9780429055911","DOIUrl":"https://doi.org/10.1201/9780429055911","url":null,"abstract":"Mixtures of experts models provide a framework in which covariates may be included in mixture models. This is achieved by modelling the parameters of the mixture model as functions of the concomitant covariates. Given their mixture model foundation, mixtures of experts models possess a diverse range of analytic uses, from clustering observations to capturing parameter heterogeneity in cross-sectional data. This chapter focuses on delineating the mixture of experts modelling framework and demonstrates the utility and flexibility of mixtures of experts models as an analytic tool.","PeriodicalId":186390,"journal":{"name":"arXiv: Methodology","volume":"113 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123047435","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Fast Exact Bayesian Inference for Sparse Signals in the Normal Sequence Model
T. van Erven, Botond Szabó
Pub Date: 2018-10-25. DOI: 10.1214/20-ba1227
We consider exact algorithms for Bayesian inference with model selection priors (including spike-and-slab priors) in the sparse normal sequence model. Because the best existing exact algorithm becomes numerically unstable for sample sizes over n=500, much attention has been devoted to alternative approaches, such as approximate algorithms (Gibbs sampling, variational Bayes, etc.), shrinkage priors (e.g., the horseshoe prior and the spike-and-slab LASSO), and empirical Bayes methods. However, by introducing algorithmic ideas from online sequential prediction, we show that exact calculations are feasible for much larger sample sizes: for general model selection priors we reach n=25000, and for certain spike-and-slab priors we can easily reach n=100000. We further prove a de Finetti-like result for finite sample sizes that characterizes exactly which model selection priors can be expressed as spike-and-slab priors. Finally, the computational speed and numerical accuracy of the proposed methods are demonstrated in experiments on simulated data and on a prostate cancer data set. In our experimental evaluation we compute guaranteed bounds on the numerical accuracy of all new algorithms, which shows that the proposed methods are numerically reliable, whereas an alternative based on long division is not.
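For intuition, in the simplest special case of an i.i.d. spike-and-slab prior with a Gaussian slab, the coordinates decouple and exact posterior inclusion probabilities have a closed form, as sketched below. The paper's algorithms target general model selection priors, where this decoupling fails and exact computation is the real challenge.

```python
import numpy as np
from scipy.stats import norm

def inclusion_probs(x, w, tau):
    """P(theta_i != 0 | x_i) in the model x_i ~ N(theta_i, 1), where
    theta_i = 0 with prob. 1 - w and theta_i ~ N(0, tau^2) with prob. w."""
    slab = norm.pdf(x, scale=np.sqrt(1.0 + tau ** 2))  # marginal under the slab
    spike = norm.pdf(x, scale=1.0)                     # marginal under the spike
    return w * slab / (w * slab + (1 - w) * spike)

x = np.array([0.1, 2.5, -4.0])
print(inclusion_probs(x, w=0.1, tau=2.0))  # larger observations get higher probability
```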
{"title":"Fast Exact Bayesian Inference for Sparse Signals in the Normal Sequence Model.","authors":"T. Erven, Botond Szabó","doi":"10.1214/20-ba1227","DOIUrl":"https://doi.org/10.1214/20-ba1227","url":null,"abstract":"We consider exact algorithms for Bayesian inference with model selection priors (including spike-and-slab priors) in the sparse normal sequence model. Because the best existing exact algorithm becomes numerically unstable for sample sizes over n=500, there has been much attention for alternative approaches like approximate algorithms (Gibbs sampling, variational Bayes, etc.), shrinkage priors (e.g. the Horseshoe prior and the Spike-and-Slab LASSO) or empirical Bayesian methods. However, by introducing algorithmic ideas from online sequential prediction, we show that exact calculations are feasible for much larger sample sizes: for general model selection priors we reach n=25000, and for certain spike-and-slab priors we can easily reach n=100000. We further prove a de Finetti-like result for finite sample sizes that characterizes exactly which model selection priors can be expressed as spike-and-slab priors. Finally, the computational speed and numerical accuracy of the proposed methods are demonstrated in experiments on simulated data and on a prostate cancer data set. In our experimental evaluation we compute guaranteed bounds on the numerical accuracy of all new algorithms, which shows that the proposed methods are numerically reliable whereas an alternative based on long division is not.","PeriodicalId":186390,"journal":{"name":"arXiv: Methodology","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-10-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126742370","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Inverse Problems and Data Assimilation
D. Sanz-Alonso, A. Stuart, Armeen Taeb
Pub Date: 2018-10-15. DOI: 10.1017/9781009414319
This concise introduction provides an entry point to the world of inverse problems and data assimilation for advanced undergraduates and beginning graduate students in the mathematical sciences. It will also appeal to researchers in science and engineering who are interested in the systematic underpinnings of methodologies widely used in their disciplines. The authors examine inverse problems and data assimilation in turn, before exploring the use of data assimilation methods to solve generic inverse problems by introducing an artificial algorithmic time. Topics covered include maximum a posteriori estimation, (stochastic) gradient descent, variational Bayes, Monte Carlo, importance sampling, and Markov chain Monte Carlo for inverse problems; and 3DVAR, 4DVAR, extended and ensemble Kalman filters, and particle filters for data assimilation. The book contains a wealth of examples and exercises, and can be used to accompany courses as well as for self-study.
{"title":"Inverse Problems and Data Assimilation","authors":"D. Sanz-Alonso, A. Stuart, Armeen Taeb","doi":"10.1017/9781009414319","DOIUrl":"https://doi.org/10.1017/9781009414319","url":null,"abstract":"This concise introduction provides an entry point to the world of inverse problems and data assimilation for advanced undergraduates and beginning graduate students in the mathematical sciences. It will also appeal to researchers in science and engineering who are interested in the systematic underpinnings of methodologies widely used in their disciplines. The authors examine inverse problems and data assimilation in turn, before exploring the use of data assimilation methods to solve generic inverse problems by introducing an artificial algorithmic time. Topics covered include maximum a posteriori estimation, (stochastic) gradient descent, variational Bayes, Monte Carlo, importance sampling and Markov chain Monte Carlo for inverse problems; and 3DVAR, 4DVAR, extended and ensemble Kalman filters, and particle filters for data assimilation. The book contains a wealth of examples and exercises, and can be used to accompany courses as well as for self-study.","PeriodicalId":186390,"journal":{"name":"arXiv: Methodology","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-10-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131612099","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Sum decomposition of divergence into three divergences
T. Nishiyama
Pub Date: 2018-10-03. DOI: 10.31219/osf.io/dvcbt
Divergence functions play a key role in measuring the discrepancy between two points in machine learning, statistics, and signal processing. Well-known divergences include the Bregman divergences, the Jensen divergences, and the f-divergences. In this paper, we show that the symmetric Bregman divergence can be decomposed into the sum of two types of Jensen divergences and a Bregman divergence. Furthermore, applying this result, we show that another sum decomposition, one that explicitly includes f-divergences, is possible.
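For reference, the divergence families mentioned above have the following standard definitions ($F$ convex and differentiable, $f$ convex with $f(1) = 0$); the paper's decomposition itself is not reproduced here, but the last identity shows the symmetric Bregman divergence being decomposed.

```latex
D_F(p, q) = F(p) - F(q) - \langle \nabla F(q),\, p - q \rangle
  \quad \text{(Bregman divergence)}

J_F(p, q) = \tfrac{1}{2}\bigl(F(p) + F(q)\bigr) - F\!\left(\tfrac{p+q}{2}\right)
  \quad \text{(Jensen divergence)}

D_f(P \,\|\, Q) = \int f\!\left(\frac{dP}{dQ}\right) dQ
  \quad \text{(f-divergence)}

% Symmetrizing the Bregman divergence cancels the F terms:
D_F(p, q) + D_F(q, p) = \langle \nabla F(p) - \nabla F(q),\, p - q \rangle
```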
{"title":"Sum decomposition of divergence into three divergences","authors":"T. Nishiyama","doi":"10.31219/osf.io/dvcbt","DOIUrl":"https://doi.org/10.31219/osf.io/dvcbt","url":null,"abstract":"Divergence functions play a key role as to measure the discrepancy between two points in the field of machine learning, statistics and signal processing. Well-known divergences are the Bregman divergences, the Jensen divergences and the f-divergences.In this paper, we show that the symmetric Bregman divergence can be decomposed into the sum of two types of Jensen divergences and the Bregman divergence.Furthermore, applying this result, we show another sum decomposition of divergence is possible which includes f-divergences explicitly.","PeriodicalId":186390,"journal":{"name":"arXiv: Methodology","volume":"349 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-10-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128954640","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
On Semiparametric Instrumental Variable Estimation of Average Treatment Effects through Data Fusion
Baoluo Sun, Wang Miao
Pub Date: 2018-10-01. DOI: 10.5705/ss.202020.0081
Suppose one is interested in estimating causal effects in the presence of potentially unmeasured confounding with the aid of a valid instrumental variable. This paper investigates the problem of making inferences about the average treatment effect when the data are fused from two separate sources, one containing information on the treatment and the other containing information on the outcome, while values of the instrument and a vector of baseline covariates are recorded in both. We provide a general set of sufficient conditions under which the average treatment effect is nonparametrically identified from the observed data law induced by data fusion, even when the two sources come from heterogeneous populations, and we derive the efficiency bound for estimating this causal parameter. For inference, we develop both parametric and semiparametric methods, including a multiply robust and locally efficient estimator that remains consistent even under partial misspecification of the observed data model. We illustrate the methods through simulations and an application to public housing projects.
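As a crude illustration of the data-fusion setting (not the paper's multiply robust estimator), the sketch below forms a two-sample Wald-type estimator with a binary instrument, combining E[Y|Z] estimated from the outcome source with E[A|Z] estimated from the treatment source; it ignores the baseline covariates and any correction for heterogeneous populations.

```python
import numpy as np

def two_sample_wald(z_out, y, z_trt, a):
    """(E[Y|Z=1] - E[Y|Z=0]) / (E[A|Z=1] - E[A|Z=0]), with the numerator
    and denominator estimated on different samples."""
    num = y[z_out == 1].mean() - y[z_out == 0].mean()
    den = a[z_trt == 1].mean() - a[z_trt == 0].mean()
    return num / den

rng = np.random.default_rng(1)
n = 5000
# Treatment source: the instrument shifts treatment uptake from 0.2 to 0.7.
z_trt = rng.integers(0, 2, n)
a = rng.binomial(1, 0.2 + 0.5 * z_trt)
# Outcome source: treatment is latent; the true effect on Y is 1.0.
z_out = rng.integers(0, 2, n)
a_latent = rng.binomial(1, 0.2 + 0.5 * z_out)
y = 1.0 * a_latent + rng.standard_normal(n)
print(two_sample_wald(z_out, y, z_trt, a))  # close to 1.0
```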
{"title":"On Semiparametric Instrumental Variable Estimation of Average Treatment Effects through Data Fusion","authors":"Baoluo Sun, Wang Miao","doi":"10.5705/ss.202020.0081","DOIUrl":"https://doi.org/10.5705/ss.202020.0081","url":null,"abstract":"Suppose one is interested in estimating causal effects in the presence of potentially unmeasured confounding with the aid of a valid instrumental variable. This paper investigates the problem of making inferences about the average treatment effect when data are fused from two separate sources, one of which contains information on the treatment and the other contains information on the outcome, while values for the instrument and a vector of baseline covariates are recorded in both. We provide a general set of sufficient conditions under which the average treatment effect is nonparametrically identified from the observed data law induced by data fusion, even when the data are from two heterogeneous populations, and derive the efficiency bound for estimating this causal parameter. For inference, we develop both parametric and semiparametric methods, including a multiply robust and locally efficient estimator that is consistent even under partial misspecification of the observed data model. We illustrate the methods through simulations and an application on public housing projects.","PeriodicalId":186390,"journal":{"name":"arXiv: Methodology","volume":"74 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133666555","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
New $L^2$-type exponentiality tests
Marija Cuparić, Bojana Milošević, Marko Obradović
Pub Date: 2018-09-20. DOI: 10.2436/20.8080.02.78
We introduce new consistent and scale-free goodness-of-fit tests for the exponential distribution based on the Puri-Rubin characterization. To construct the test statistics, we employ the weighted $L^2$ distance between the $V$-empirical Laplace transforms of the random variables that appear in the characterization. The resulting test statistics are degenerate $V$-statistics with estimated parameters. We compare our tests, in terms of Bahadur efficiency, to the likelihood ratio test, as well as to some recent characterization-based goodness-of-fit tests for the exponential distribution. We also compare the powers of our tests to those of some recent and classical exponentiality tests. Under both criteria, our tests prove strong and outperform most of their competitors.
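The Puri-Rubin characterization states that $X$ is exponential if and only if $|X_1 - X_2|$ has the same distribution as $X$, so a statistic of this type integrates the squared difference between the corresponding empirical and $V$-empirical Laplace transforms against the weight $e^{-at}$. The scaling conventions in the sketch below are our assumptions, not necessarily the paper's exact definition.

```python
import numpy as np
from scipy.integrate import quad

def l2_exp_stat(x, a=1.0):
    x = np.asarray(x, dtype=float)
    x = x / x.mean()                 # normalizing by the mean makes the test scale-free
    n = len(x)
    diffs = np.abs(x[:, None] - x[None, :]).ravel()

    def integrand(t):
        lt_x = np.mean(np.exp(-t * x))        # empirical Laplace transform of X
        lt_d = np.mean(np.exp(-t * diffs))    # V-empirical Laplace transform of |X1 - X2|
        return (lt_d - lt_x) ** 2 * np.exp(-a * t)

    val, _ = quad(integrand, 0.0, np.inf)
    return n * val

rng = np.random.default_rng(2)
print(l2_exp_stat(rng.exponential(size=100)))  # small under exponentiality
print(l2_exp_stat(rng.uniform(size=100)))      # larger under this alternative
```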
{"title":"New $L^2$-type exponentiality tests.","authors":"Marija Cupari'c, Bojana Milovsevi'c, Marko Obradovi'c","doi":"10.2436/20.8080.02.78","DOIUrl":"https://doi.org/10.2436/20.8080.02.78","url":null,"abstract":"We introduce new consistent and scale-free goodness-of-fit tests for the exponential distribution based on Puri-Rubin characterization. For the construction of test statistics we employ weighted $L^2$ distance between $V$-empirical Laplace transforms of random variables that appear in the characterization. The resulting test statistics are degenerate V-statistics with estimated parameters. We compare our tests, in terms of the Bahadur efficiency, to the likelihood ratio test, as well as some recent characterization based goodness-of-fit tests for the exponential distribution. We also compare the powers of our tests to the powers of some recent and classical exponentiality tests. In both criteria, our tests are shown to be strong and outperform most of their competitors.","PeriodicalId":186390,"journal":{"name":"arXiv: Methodology","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-09-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126081050","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Wasserstein Gradients for the Temporal Evolution of Probability Distributions
Yaqing Chen, H. Müller
Pub Date: 2018-09-10. DOI: 10.1214/21-EJS1883
Many studies have been conducted on flows of probability measures, often in terms of gradient flows. We introduce a novel approach for modeling the instantaneous evolution of empirically observed distribution flows over time, with a data-analytic focus that has not yet been explored. The proposed model describes the observed flow of distributions on the one-dimensional Euclidean space $\mathbb{R}$ over time, based on the Wasserstein distance and utilizing derivatives of the optimal transport maps over time. The resulting time dynamics of optimal transport maps are illustrated with time-varying distribution data, including yearly income distributions, the evolution of mortality over calendar years, and age-dependent height distributions of children from the longitudinal Zürich growth study.
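A minimal sketch of the basic objects: on the real line, the optimal transport map from $F_s$ to $F_t$ is $T = F_t^{-1} \circ F_s$, and a difference quotient of such maps over time gives an empirical version of the transport derivative; the empirical-quantile implementation below is our own simplification, not the paper's estimator.

```python
import numpy as np

def transport_map(sample_s, sample_t, x):
    """Empirical optimal transport map on R: T(x) = F_t^{-1}(F_s(x))."""
    u = np.searchsorted(np.sort(sample_s), x, side="right") / len(sample_s)
    u = np.clip(u, 0.0, 1.0)
    return np.quantile(sample_t, u)

def transport_derivative(sample_s, sample_t, x, dt):
    """Finite-difference transport velocity (T(x) - x) / dt."""
    return (transport_map(sample_s, sample_t, x) - x) / dt

# Two snapshots of a distribution drifting right by 0.5 per unit time:
rng = np.random.default_rng(3)
s0 = rng.normal(0.0, 1.0, 500)
s1 = rng.normal(0.5, 1.0, 500)
x = np.linspace(-2.0, 2.0, 5)
print(transport_derivative(s0, s1, x, dt=1.0))  # approximately 0.5 everywhere
```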
{"title":"Wasserstein Gradients for the Temporal Evolution of Probability Distributions","authors":"Yaqing Chen, H. Muller","doi":"10.1214/21-EJS1883","DOIUrl":"https://doi.org/10.1214/21-EJS1883","url":null,"abstract":"Many studies have been conducted on flows of probability measures, often in terms of gradient flows. We introduce here a novel approach for the modeling of the instantaneous evolution of empirically observed distribution flows over time with a data-analytic focus that has not yet been explored. The proposed model describes the observed flow of distributions on one-dimensional Euclidean space $mathbb{R}$ over time based on the Wasserstein distance, utilizing derivatives of optimal transport maps over time. The resulting time dynamics of optimal transport maps are illustrated with time-varying distribution data that include yearly income distributions, the evolution of mortality over calendar years, and data on age-dependent height distributions of children from the longitudinal Z\"urich growth study.","PeriodicalId":186390,"journal":{"name":"arXiv: Methodology","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-09-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125274080","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}