In this paper, we explore optimal treatment allocation policies that target distributional welfare. Most of the literature on treatment choice has considered utilitarian welfare based on the conditional average treatment effect (ATE). While average welfare is intuitive, it may yield undesirable allocations, especially when individuals are heterogeneous (e.g., with outliers), which is the very reason individualized treatments were introduced in the first place. This observation motivates us to propose an optimal policy that allocates the treatment based on the conditional \emph{quantile of individual treatment effects} (QoTE). Depending on the choice of the quantile probability, this criterion can accommodate a policymaker who is either prudent or negligent. The challenge in identifying the QoTE is that it requires knowledge of the joint distribution of the counterfactual outcomes, which is generally hard to recover even with experimental data. Therefore, we introduce minimax optimal policies that are robust to model uncertainty. We then propose a range of identifying assumptions under which we can point or partially identify the QoTE. We establish asymptotic bounds on the regret of implementing the proposed policies, and we consider both stochastic and deterministic rules. In simulations and two empirical applications, we compare optimal decisions based on the QoTE with decisions based on other criteria.
"Individualized Treatment Allocations with Distributional Welfare." Yifan Cui, Sukjin Han. arXiv:2311.15878, arXiv - MATH - Statistics Theory, 2023-11-27.
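The QoTE allocation rule described above can be sketched in a few lines. This is an illustrative toy, not the paper's estimator: it presumes draws from the conditional ITE distribution are available (which, as the abstract notes, requires an identifying assumption such as rank invariance), and the function name and simulated data are hypothetical.

```python
import numpy as np

def qote_policy(ite_draws, tau):
    """Allocate treatment iff the tau-quantile of the conditional
    individual-treatment-effect (ITE) distribution is positive.

    ite_draws : draws from the conditional ITE distribution (only
                available under a point-identifying assumption).
    tau       : quantile probability; a small tau encodes a prudent
                policymaker, a large tau a negligent one.
    """
    return float(np.quantile(ite_draws, tau)) > 0.0

rng = np.random.default_rng(0)
# Hypothetical covariate cell: positive median effect, heavy left tail
# (the kind of heterogeneity where the ATE criterion can mislead).
ite = np.concatenate([rng.normal(1.0, 0.5, 900),
                      rng.normal(-5.0, 1.0, 100)])
treat_prudent = qote_policy(ite, tau=0.10)  # guards the worst decile
treat_median = qote_policy(ite, tau=0.50)   # median-based criterion
```

Here the median criterion treats the cell while the prudent tau = 0.1 criterion does not, illustrating how the quantile probability tunes the planner's attitude toward the left tail.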
Seyong Hwang, Kyoungjae Lee, Sunmin Oh, Gunwoong Park
This study proposes the first Bayesian approach for learning high-dimensional linear Bayesian networks. The proposed approach iteratively estimates each element of the topological ordering, from the last node backward, together with its parent set, using the inverse of a partial covariance matrix. The proposed method successfully recovers the underlying structure when Bayesian regularization with unequal shrinkage is applied to the inverse covariance matrix. Specifically, it is shown that sample sizes $n = \Omega(d_M^2 \log p)$ and $n = \Omega(d_M^2 p^{2/m})$ are sufficient for the proposed algorithm to learn linear Bayesian networks with sub-Gaussian and $4m$-th bounded-moment error distributions, respectively, where $p$ is the number of nodes and $d_M$ is the maximum degree of the moralized graph. The theoretical findings are supported by extensive simulation studies and a real data analysis. Furthermore, the proposed method is demonstrated to outperform state-of-the-art frequentist approaches, such as the BHLSM, LISTEN, and TD algorithms, on synthetic data.
"Bayesian Approach to Linear Bayesian Networks." arXiv:2311.15610, arXiv - MATH - Statistics Theory, 2023-11-27.
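The backward-ordering idea can be illustrated with a frequentist plug-in sketch: in a linear SEM with equal error variances, the terminal (sink) node is the one with the smallest diagonal entry of the inverse of the remaining nodes' covariance matrix, so peeling sinks off repeatedly recovers an ordering. This is a population-level toy with an exact covariance; the paper's Bayesian method instead uses a regularized posterior estimate of the inverse partial covariance, which this sketch does not attempt.

```python
import numpy as np

def backward_order(cov):
    """Recover a topological ordering (sources first) by repeatedly
    removing the node with the smallest diagonal entry of the inverse
    (partial) covariance matrix of the remaining nodes: under equal
    error variances, a sink has maximal conditional variance, i.e.
    minimal precision diagonal."""
    nodes = list(range(cov.shape[0]))
    order = []
    while len(nodes) > 1:
        theta = np.linalg.inv(cov[np.ix_(nodes, nodes)])
        sink = nodes[int(np.argmin(np.diag(theta)))]
        order.append(sink)
        nodes.remove(sink)
    order.append(nodes[0])
    return order[::-1]

# Chain 0 -> 1 -> 2 with unit error variances:
# X0 = e0, X1 = X0 + e1, X2 = X1 + e2.
B = np.array([[0, 0, 0], [1, 0, 0], [0, 1, 0]], float)  # B[i,j]=1 if j -> i
A = np.linalg.inv(np.eye(3) - B)
cov = A @ A.T                 # population covariance with Var(e) = 1
print(backward_order(cov))    # -> [0, 1, 2]
```

On this chain the first peel removes node 2 (precision diagonal [2, 2, 1]), then node 1, leaving the correct ordering.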
Graphical and sparse (inverse) covariance models have found widespread use in modern sample-starved high-dimensional applications. Part of their wide appeal stems from the significantly smaller sample sizes required for estimators to exist, especially in comparison with the classical full covariance model. For undirected Gaussian graphical models, the minimum sample size required for the existence of maximum likelihood estimators was an open question for almost half a century, and was only recently settled. The very same question for pseudo-likelihood estimators has remained unsolved ever since their introduction in the 1970s. Pseudo-likelihood estimators have recently received renewed attention: they impose fewer restrictive assumptions and offer better computational tractability, improved statistical performance, and suitability for modern high-dimensional applications, renewing interest in this longstanding problem. In this paper, we undertake a comprehensive study of this open problem within the context of the two classes of pseudo-likelihood methods proposed in the literature. We provide a precise answer to this question for both pseudo-likelihood approaches and relate the corresponding solutions to their Gaussian counterpart.
"Pseudo-likelihood Estimators for Graphical Models: Existence and Uniqueness." Benjamin Roycraft, Bala Rajaratnam. arXiv:2311.15528, arXiv - MATH - Statistics Theory, 2023-11-27.
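For readers unfamiliar with the estimators in question, a Gaussian pseudo-likelihood can be maximized via nodewise least-squares regressions (Besag-style). The unpenalized sketch below is illustrative, not the paper's method: the fitted coefficients satisfy $\beta_{ij} = -\Theta_{ij}/\Theta_{ii}$, and the regressions exist only when $n$ exceeds a threshold, which is exactly the kind of existence question the paper answers precisely.

```python
import numpy as np

def nodewise_pseudolikelihood(X):
    """Gaussian pseudo-likelihood fit via nodewise regressions:
    regress each variable on all others by ordinary least squares.
    Row j of the returned matrix holds the coefficients beta_{jk},
    which relate to the precision matrix by beta_jk = -Theta_jk/Theta_jj.
    Unpenalized, so it requires enough samples for each regression."""
    n, p = X.shape
    X = X - X.mean(axis=0)
    B = np.zeros((p, p))
    for j in range(p):
        others = [k for k in range(p) if k != j]
        coef, *_ = np.linalg.lstsq(X[:, others], X[:, j], rcond=None)
        B[j, others] = coef
    return B

# Sanity check on a 3-node Gaussian chain with known precision matrix.
rng = np.random.default_rng(0)
theta = np.array([[2., -1., 0.], [-1., 2., -1.], [0., -1., 1.]])
X = rng.multivariate_normal(np.zeros(3), np.linalg.inv(theta), size=20000)
B = nodewise_pseudolikelihood(X)
# B[0, 1] should be near -theta[0,1]/theta[0,0] = 0.5, B[0, 2] near 0.
```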
Given two Erdős-Rényi graphs with $n$ vertices whose edges are correlated through a latent vertex correspondence, we study complexity lower bounds for the associated correlation detection problem for the class of low-degree polynomial algorithms. We provide evidence that any degree-$O(\rho^{-1})$ polynomial algorithm fails for detection, where $\rho$ is the edge correlation. Furthermore, in the sparse regime where the edge density $q = n^{-1+o(1)}$, we provide evidence that any degree-$d$ polynomial algorithm fails for detection, as long as $\log d = o\big(\frac{\log n}{\log nq} \wedge \sqrt{\log n}\big)$ and the correlation $\rho$
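The correlated Erdős-Rényi pair in this detection problem can be sampled with a standard subsampling coupling. The sketch below is an illustrative construction (with the latent vertex correspondence taken as the identity for brevity), not part of the paper, which studies lower bounds for detection rather than sampling.

```python
import numpy as np

def correlated_er_pair(n, q, rho, rng):
    """Sample edge indicators of two Erdős-Rényi graphs G(n, q) whose
    corresponding edges have correlation rho: draw a common parent
    graph, then subsample its edges independently into each child.
    With s = rho*(1-q) + q, each child is marginally G(n, q) and
    Corr(A_e, B_e) = (s - q)/(1 - q) = rho."""
    s = rho * (1 - q) + q
    m = n * (n - 1) // 2                 # number of vertex pairs
    parent = rng.random(m) < q / s       # parent edge indicators
    a = parent & (rng.random(m) < s)     # first graph's edges
    b = parent & (rng.random(m) < s)     # second graph's edges
    return a, b

rng = np.random.default_rng(1)
a, b = correlated_er_pair(500, q=0.1, rho=0.5, rng=rng)
rho_hat = np.corrcoef(a.astype(float), b.astype(float))[0, 1]
# rho_hat should be close to 0.5 over the ~125k vertex pairs.
```

In the paper's setting one would additionally relabel the second graph by a hidden uniform permutation; the edge-level coupling above is unchanged by that relabeling.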