We consider the problem of distribution-free predictive inference, with the goal of producing predictive coverage guarantees that hold conditionally rather than marginally. Existing methods such as conformal prediction offer marginal coverage guarantees, where predictive coverage holds on average over all possible test points, but this is not sufficient for many practical applications where we would like to know that our predictions are valid for a given individual, not merely on average over a population. On the other hand, exact conditional inference guarantees are known to be impossible without imposing assumptions on the underlying distribution. In this work, we aim to explore the space in between these two and examine what types of relaxations of the conditional coverage property would alleviate some of the practical concerns with marginal coverage guarantees while still being possible to achieve in a distribution-free setting.
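For reference, the marginal guarantee mentioned above can be illustrated with split conformal prediction. The sketch below is a minimal illustration and is not taken from the paper: the function name `split_conformal_interval`, the absolute-residual score, and the pre-fitted `predict` callable are all assumptions. Under exchangeability of calibration and test points, the resulting interval covers the test response with probability at least 1 - alpha on average over test points, which is exactly the marginal (not conditional) property the abstract describes.

```python
import numpy as np

def split_conformal_interval(x_cal, y_cal, x_test, predict, alpha=0.1):
    """Split conformal band around a pre-fitted predictor (hypothetical helper).

    Returns (lower, upper) arrays. Coverage is marginal: it holds on average
    over test points, not for each individual x.
    """
    # Conformity scores: absolute residuals on held-out calibration data.
    scores = np.abs(y_cal - predict(x_cal))
    n = scores.size
    # Rank of the calibration score used as the interval half-width.
    k = int(np.ceil((n + 1) * (1 - alpha)))
    # With too few calibration points the only valid interval is infinite.
    q = np.inf if k > n else np.sort(scores)[k - 1]
    center = predict(x_test)
    return center - q, center + q

# Usage on synthetic data (the linear model 2 * t is an assumption).
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, 2000)
y = 2 * x + rng.standard_normal(2000)
predict = lambda t: 2 * t
lo, hi = split_conformal_interval(x[:1000], y[:1000], x[1000:], predict, alpha=0.1)
coverage = np.mean((y[1000:] >= lo) & (y[1000:] <= hi))
print(f"empirical marginal coverage: {coverage:.3f}")
```

The half-width `q` is the same for every test point, so the band cannot adapt to regions where the predictor is less accurate; this is one practical face of the gap between marginal and conditional coverage.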
Rina Foygel Barber, Emmanuel J Candès, Aaditya Ramdas, Ryan J Tibshirani. "The limits of distribution-free conditional predictive inference." Information and Inference: A Journal of the IMA, vol. 10, no. 1, pp. 455-482. Published 2020-10-01. https://doi.org/10.1093/imaiai/iaaa017
Through the direct study of the analysis estimator we derive oracle inequalities with fast and slow rates by adapting the arguments involving projections by Dalalyan et al. (2017, Bernoulli, 23, 552–581). We then extend the theory to the square root analysis estimator. Finally, we focus on (square root) total variation regularized estimators on graphs and obtain constant-friendly rates, which, up to log terms, match previous results obtained by entropy calculations. We also obtain an oracle inequality for the (square root) total variation regularized estimator over the cycle graph.
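For orientation, the two estimators named above are commonly written as follows. This is a sketch in assumed notation (the abstract itself fixes none): $Y \in \mathbb{R}^n$ is the observation vector, $D$ is the analysis operator, and $\lambda, \lambda_0 > 0$ are tuning parameters. For total variation on a graph, $D$ would be the edge incidence matrix.

```latex
\hat f := \arg\min_{f \in \mathbb{R}^n}
  \left\{ \|Y - f\|_n^2 + 2\lambda \|Df\|_1 \right\},
\qquad
\hat f_{\mathrm{sqrt}} := \arg\min_{f \in \mathbb{R}^n}
  \left\{ \|Y - f\|_n + \lambda_0 \|Df\|_1 \right\},
\qquad
\|v\|_n^2 := \frac{1}{n} \sum_{i=1}^{n} v_i^2 .
```

The square root version takes the unsquared norm of the residual, so a suitable choice of $\lambda_0$ does not have to depend on the unknown noise level.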
Francesco Ortelli, Sara van de Geer. "Oracle inequalities for square root analysis estimators with application to total variation penalties." Information and Inference: A Journal of the IMA, vol. 10, no. 1, pp. 483-514. Published 2020-10-01. https://doi.org/10.1093/imaiai/iaaa002
We consider the task of recovering a pair of vectors from a set of rank one bilinear measurements, possibly corrupted by noise. Most notably, the problem of robust blind deconvolution can be modeled in this way. We consider a natural nonsmooth formulation of the rank one bilinear sensing problem and show that its moduli of weak convexity, sharpness and Lipschitz continuity are all dimension independent, under favorable statistical assumptions. This phenomenon persists even when up to half of the measurements are corrupted by noise. Consequently, standard algorithms, such as the subgradient and prox-linear methods, converge at a rapid dimension-independent rate when initialized within a constant relative error of the solution. We complete the paper with a new initialization strategy, complementing the local search algorithms. The initialization procedure is both provably efficient and robust to outlying measurements. Numerical experiments, on both simulated and real data, illustrate the developed theory and methods.
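The local search step described above can be sketched with a Polyak-step subgradient method on the nonsmooth objective $f(w, x) = \frac{1}{m}\sum_i |\langle \ell_i, w\rangle \langle r_i, x\rangle - b_i|$. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `bilinear_loss` and `polyak_subgradient` are hypothetical, the measurements are taken to be noiseless so that the minimal value `f_min` is 0, and the starting point is assumed to be close to the solution.

```python
import numpy as np

def bilinear_loss(w, x, L, R, b):
    """Mean absolute residual of the rank one bilinear measurements."""
    return np.mean(np.abs((L @ w) * (R @ x) - b))

def polyak_subgradient(L, R, b, w0, x0, iters=500, f_min=0.0):
    """Subgradient method with Polyak step size (assumes f_min is known)."""
    w, x = w0.astype(float), x0.astype(float)  # astype returns copies
    m = len(b)
    for _ in range(iters):
        Lw, Rx = L @ w, R @ x
        resid = Lw * Rx - b
        gap = np.mean(np.abs(resid)) - f_min
        s = np.sign(resid)              # subgradient of |.| at each residual
        gw = L.T @ (s * Rx) / m         # subgradient block for w
        gx = R.T @ (s * Lw) / m         # subgradient block for x
        g2 = gw @ gw + gx @ gx
        if gap <= 0 or g2 == 0:
            break                       # already optimal, or no direction
        step = gap / g2                 # Polyak step: (f - f_min) / ||g||^2
        w -= step * gw
        x -= step * gx
    return w, x

# Usage on synthetic Gaussian measurements (all sizes are assumptions).
rng = np.random.default_rng(0)
m, d = 400, 5
L, R = rng.standard_normal((m, d)), rng.standard_normal((m, d))
w_true = rng.standard_normal(d); w_true /= np.linalg.norm(w_true)
x_true = rng.standard_normal(d); x_true /= np.linalg.norm(x_true)
b = (L @ w_true) * (R @ x_true)
w0 = w_true + 0.1 * rng.standard_normal(d)
x0 = x_true + 0.1 * rng.standard_normal(d)
w_hat, x_hat = polyak_subgradient(L, R, b, w0, x0)
print("initial loss:", bilinear_loss(w0, x0, L, R, b))
print("final loss:  ", bilinear_loss(w_hat, x_hat, L, R, b))
```

The pair $(w, x)$ is only identifiable up to the rescaling $(\alpha w, x/\alpha)$, so recovery is better judged through the loss or the outer product $w x^\top$ than through the individual vectors. With corrupted measurements the minimal value is generally not 0, and the Polyak step would need to be replaced, for example by a geometrically decaying step size.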
Vasileios Charisopoulos, Damek Davis, Mateo Díaz, Dmitriy Drusvyatskiy. "Composite optimization for robust rank one bilinear sensing." Information and Inference: A Journal of the IMA, vol. 10, no. 1, pp. 333-396. Published 2020-10-01. https://doi.org/10.1093/imaiai/iaaa027