Likelihood Geometry of the Squared Grassmannian
Hannah Friedman
arXiv:2409.03730 (2024-09-05)

We study projection determinantal point processes and their connection to the squared Grassmannian. We prove that the log-likelihood function of this statistical model has $(n - 1)!/2$ critical points, all of which are real and positive, thereby settling a conjecture of Devriendt, Friedman, Reinke, and Sturmfels.
Error bounds of Median-of-means estimators with VC-dimension
Yuxuan Wang, Yiming Chen, Hanchao Wang, Lixin Zhang
arXiv:2409.03410 (2024-09-05)

We obtain upper error bounds for robust estimators of the mean vector using the median-of-means (MOM) method. The method is designed to handle data with heavy tails and contamination, requiring only a finite second moment, which is weaker than many competing assumptions, and it relies on the VC dimension rather than the Rademacher complexity to measure statistical complexity. This allows us to apply MOM to covariance estimation without imposing conditions such as $L$-sub-Gaussianity or $L_{4}$-$L_{2}$ norm equivalence. In particular, we derive a new robust estimator, the MOM version of the halfspace depth, along with error bounds for mean estimation in any norm.
The Geometry and Well-Posedness of Sparse Regularized Linear Regression
Jasper Marijn Everink, Yiqiu Dong, Martin Skovgaard Andersen
arXiv:2409.03461 (2024-09-05)
In this work, we study the well-posedness of certain sparse regularized linear regression problems, i.e., the existence, uniqueness and continuity of the solution map with respect to the data. We focus on regularization functions that are convex piecewise linear, i.e., whose epigraph is polyhedral. This includes total variation on graphs and polyhedral constraints. We provide a geometric framework for these functions based on their connection to polyhedral sets and apply it to study the well-posedness of the corresponding sparse regularized linear regression problem. In particular, we provide geometric conditions for well-posedness of the regression problem, compare these conditions to those for smooth regularization, and show the computational difficulty of verifying these conditions.
{"title":"The Geometry and Well-Posedness of Sparse Regularized Linear Regression","authors":"Jasper Marijn Everink, Yiqiu Dong, Martin Skovgaard Andersen","doi":"arxiv-2409.03461","DOIUrl":"https://doi.org/arxiv-2409.03461","url":null,"abstract":"In this work, we study the well-posedness of certain sparse regularized\u0000linear regression problems, i.e., the existence, uniqueness and continuity of\u0000the solution map with respect to the data. We focus on regularization functions\u0000that are convex piecewise linear, i.e., whose epigraph is polyhedral. This\u0000includes total variation on graphs and polyhedral constraints. We provide a\u0000geometric framework for these functions based on their connection to polyhedral\u0000sets and apply this to the study of the well-posedness of the corresponding\u0000sparse regularized linear regression problem. Particularly, we provide\u0000geometric conditions for well-posedness of the regression problem, compare\u0000these conditions to those for smooth regularization, and show the computational\u0000difficulty of verifying these conditions.","PeriodicalId":501379,"journal":{"name":"arXiv - STAT - Statistics Theory","volume":"8 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142192999","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Convergence Rates for the Maximum A Posteriori Estimator in PDE-Regression Models with Random Design
Maximilian Siebel
arXiv:2409.03417 (2024-09-05)

We consider the statistical inverse problem of recovering a parameter $\theta \in H^\alpha$ from data arising from the Gaussian regression problem
\[
Y = \mathscr{G}(\theta)(Z) + \varepsilon
\]
with nonlinear forward map $\mathscr{G}: \mathbb{L}^2 \to \mathbb{L}^2$, random design points $Z$ and Gaussian noise $\varepsilon$. The estimation strategy is based on a least squares approach under $\Vert \cdot \Vert_{H^\alpha}$-constraints. We establish the existence of a least squares estimator $\hat{\theta}$ as a maximizer of a given functional under Lipschitz-type assumptions on the forward map $\mathscr{G}$. A general concentration result is shown, which is used to prove consistency and upper bounds for the prediction error. The corresponding rates of convergence reflect not only the smoothness of the parameter of interest but also the ill-posedness of the underlying inverse problem. We apply the general model to the Darcy problem, where the recovery of an unknown coefficient function $f$ of a PDE is of interest. For this example, we also provide corresponding rates of convergence for the prediction and estimation errors. Additionally, we briefly discuss the applicability of the general model to other problems.
Bulk Spectra of Truncated Sample Covariance Matrices
Subhroshekhar Ghosh, Soumendu Sundar Mukherjee, Himasish Talukdar
arXiv:2409.02911 (2024-09-04)

Determinantal Point Processes (DPPs), which originate from quantum and statistical physics, are known for modelling diversity. Recent research [Ghosh and Rigollet (2020)] has demonstrated that certain matrix-valued $U$-statistics (that are truncated versions of the usual sample covariance matrix) can effectively estimate parameters in the context of Gaussian DPPs and enhance dimension reduction techniques, outperforming standard methods like PCA in clustering applications. This paper explores the spectral properties of these matrix-valued $U$-statistics in the null setting of an isotropic design. These matrices may be represented as $X L X^\top$, where $X$ is a data matrix and $L$ is the Laplacian matrix of a random geometric graph associated to $X$. The main mathematically interesting twist here is that the matrix $L$ is dependent on $X$. We give complete descriptions of the bulk spectra of these matrix-valued $U$-statistics in terms of the Stieltjes transforms of their empirical spectral measures. The results and the techniques are in fact able to address a broader class of kernelised random matrices, connecting their limiting spectra to generalised Marčenko-Pastur laws and free probability.
Smoothed Robust Phase Retrieval
Zhong Zheng, Lingzhou Xue
arXiv:2409.01570 (2024-09-03)

The phase retrieval problem in the presence of noise aims to recover the signal vector of interest from a set of quadratic measurements with infrequent but arbitrary corruptions, and it plays an important role in many scientific applications. However, the essential geometric structure of nonconvex robust phase retrieval based on the $\ell_1$-loss, and in particular whether it admits spurious local solutions, is largely unknown even under the ideal noiseless setting, and its intrinsic nonsmoothness also impacts the efficiency of optimization algorithms. This paper introduces smoothed robust phase retrieval (SRPR) based on a family of convolution-type smoothed loss functions. Theoretically, we prove that SRPR enjoys a benign geometric structure with high probability: (1) under the noiseless situation, SRPR has no spurious local solutions, and the target signals are global solutions, and (2) under infrequent but arbitrary corruptions, we characterize the stationary points of SRPR and prove its benign landscape, which is the first landscape analysis of phase retrieval with corruption in the literature. Moreover, we prove a local linear convergence rate for gradient descent applied to SRPR under the noiseless situation. Experiments on both simulated datasets and image recovery demonstrate the numerical performance of SRPR.
Demystified: double robustness with nuisance parameters estimated at rate n-to-the-1/4
Judith J. Lok
arXiv:2409.02320 (2024-09-03)

Have you also been wondering what this thing is with double robustness and nuisance parameters estimated at rate $n^{1/4}$? It turns out that to understand this phenomenon one just needs the Mean Value Theorem (or a Taylor expansion) and some smoothness conditions. This note explains why, under some fairly simple conditions, as long as the nuisance parameter $\theta \in \mathbb{R}^k$ is estimated at rate $n^{1/4}$ or faster: 1. the resulting variance of the estimator of the parameter of interest $\psi \in \mathbb{R}^d$ does not depend on how the nuisance parameter $\theta$ is estimated, and 2. the sandwich estimator of the variance of $\hat{\psi}$ ignoring estimation of $\theta$ is consistent.
Deconvolution of repeated measurements corrupted by unknown noise
Jérémie Capitao-Miniconi, Elisabeth Gassiat, Luc Lehéricy
arXiv:2409.02014 (2024-09-03)
Recent advances have demonstrated the possibility of solving the deconvolution problem without prior knowledge of the noise distribution. In this paper, we study the repeated measurements model, where information is derived from multiple measurements of $X$, each perturbed independently by additive errors. Our contributions include establishing identifiability without any assumption on the noise except for coordinate independence. We propose an estimator of the density of the signal for which we provide rates of convergence, and prove that it reaches the minimax rate in the case where the support of the signal is compact. Additionally, we propose a model selection procedure for adaptive estimation. Numerical simulations demonstrate the effectiveness of our approach even with limited sample sizes.
{"title":"Deconvolution of repeated measurements corrupted by unknown noise","authors":"Jérémie Capitao-Miniconi, Elisabeth Gassiat, Luc Lehéricy","doi":"arxiv-2409.02014","DOIUrl":"https://doi.org/arxiv-2409.02014","url":null,"abstract":"Recent advances have demonstrated the possibility of solving the\u0000deconvolution problem without prior knowledge of the noise distribution. In\u0000this paper, we study the repeated measurements model, where information is\u0000derived from multiple measurements of X perturbed independently by additive\u0000errors. Our contributions include establishing identifiability without any\u0000assumption on the noise except for coordinate independence. We propose an\u0000estimator of the density of the signal for which we provide rates of\u0000convergence, and prove that it reaches the minimax rate in the case where the\u0000support of the signal is compact. Additionally, we propose a model selection\u0000procedure for adaptive estimation. Numerical simulations demonstrate the\u0000effectiveness of our approach even with limited sample sizes.","PeriodicalId":501379,"journal":{"name":"arXiv - STAT - Statistics Theory","volume":"33 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142192789","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Formalizing the causal interpretation in accelerated failure time models with unmeasured heterogeneity
Mari Brathovde, Hein Putter, Morten Valberg, Richard A. J. Post
arXiv:2409.01983 (2024-09-03)
In the presence of unmeasured heterogeneity, the hazard ratio for exposure has a complex causal interpretation. To address this, accelerated failure time (AFT) models, which assess the effect on the survival time ratio scale, are often suggested as a better alternative. AFT models also allow for straightforward confounder adjustment. In this work, we formalize the causal interpretation of the acceleration factor in AFT models using structural causal models and data under independent censoring. We prove that the acceleration factor is a valid causal effect measure, even in the presence of frailty and treatment effect heterogeneity. Through simulations, we show that the acceleration factor better captures the causal effect than the hazard ratio when both AFT and proportional hazards models apply. Additionally, we extend the interpretation to systems with time-dependent acceleration factors, revealing the challenge of distinguishing between a time-varying homogeneous effect and unmeasured heterogeneity. While the causal interpretation of acceleration factors is promising, we caution practitioners about potential challenges in estimating these factors in the presence of effect heterogeneity.
{"title":"Formalizing the causal interpretation in accelerated failure time models with unmeasured heterogeneity","authors":"Mari Brathovde, Hein Putter, Morten Valberg, Richard A. J. Post","doi":"arxiv-2409.01983","DOIUrl":"https://doi.org/arxiv-2409.01983","url":null,"abstract":"In the presence of unmeasured heterogeneity, the hazard ratio for exposure\u0000has a complex causal interpretation. To address this, accelerated failure time\u0000(AFT) models, which assess the effect on the survival time ratio scale, are\u0000often suggested as a better alternative. AFT models also allow for\u0000straightforward confounder adjustment. In this work, we formalize the causal\u0000interpretation of the acceleration factor in AFT models using structural causal\u0000models and data under independent censoring. We prove that the acceleration\u0000factor is a valid causal effect measure, even in the presence of frailty and\u0000treatment effect heterogeneity. Through simulations, we show that the\u0000acceleration factor better captures the causal effect than the hazard ratio\u0000when both AFT and proportional hazards models apply. Additionally, we extend\u0000the interpretation to systems with time-dependent acceleration factors,\u0000revealing the challenge of distinguishing between a time-varying homogeneous\u0000effect and unmeasured heterogeneity. While the causal interpretation of\u0000acceleration factors is promising, we caution practitioners about potential\u0000challenges in estimating these factors in the presence of effect heterogeneity.","PeriodicalId":501379,"journal":{"name":"arXiv - STAT - Statistics Theory","volume":"50 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142192794","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A sparse PAC-Bayesian approach for high-dimensional quantile prediction
The Tien Mai
arXiv:2409.01687 (2024-09-03)

Quantile regression, a robust method for estimating conditional quantiles, has advanced significantly in fields such as econometrics, statistics, and machine learning. In high-dimensional settings, where the number of covariates exceeds the sample size, penalized methods like the lasso have been developed to address sparsity challenges. Bayesian methods, initially connected to quantile regression via the asymmetric Laplace likelihood, have also evolved, though issues with posterior variance have led to new approaches, including pseudo/score likelihoods. This paper presents a novel probabilistic machine learning approach for high-dimensional quantile prediction. It uses a pseudo-Bayesian framework with a scaled Student-t prior and Langevin Monte Carlo for efficient computation. The method comes with strong theoretical guarantees in the form of PAC-Bayes bounds, which establish non-asymptotic oracle inequalities showing minimax-optimal prediction error and adaptability to unknown sparsity. Its effectiveness is validated through simulations and real-world data, where it performs competitively against established frequentist and Bayesian techniques.