Title: The Local Landscape of Phase Retrieval Under Limited Samples
Authors: Kaizhao Liu; Zihao Wang; Lei Wu
Journal: IEEE Transactions on Information Theory, vol. 70, no. 12, pp. 9012-9035
Publication date: 2024-10-15
Publication type: Journal Article
DOI: 10.1109/TIT.2024.3481269
URL: https://ieeexplore.ieee.org/document/10718309/
Citations: 0
Abstract
We present a fine-grained analysis of the local landscape of phase retrieval under the regime of limited samples. Specifically, we aim to ascertain the minimal sample size required to guarantee a benign local landscape surrounding global minima in high dimensions. Let $n$ and $d$ denote the sample size and input dimension, respectively. We first explore local convexity and establish that when $n = o(d \log d)$, for almost every fixed point in the local ball, the Hessian matrix has negative eigenvalues, provided $d$ is sufficiently large. We next consider one-point convexity and show that, as long as $n = \omega(d)$, with high probability the landscape is one-point strongly convex in the local annulus $\{w \in \mathbb{R}^{d} : o_{d}(1) \leqslant \|w - w^{*}\| \leqslant c\}$, where $w^{*}$ is the ground truth and $c$ is an absolute constant. This implies that gradient descent, initialized from any point in this domain, can converge to an $o_{d}(1)$-loss solution exponentially fast. Furthermore, we show that when $n = o(d \log d)$, there is a radius of $\widetilde{\Theta}(\sqrt{1/d})$ such that one-point convexity breaks down in the corresponding smaller local ball. This indicates that one-point convexity alone cannot establish convergence of gradient descent to the exact $w^{*}$ under limited samples.
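
The abstract does not spell out the objective, so the following is only a minimal sketch under stated assumptions: Gaussian measurements $x_i$, noiseless labels $y_i = (x_i^\top w^{*})^2$, and the commonly used quartic loss $L(w) = \frac{1}{4n}\sum_i ((x_i^\top w)^2 - y_i)^2$. The function names (`make_problem`, `hessian`, `one_point_ratio`) and the problem sizes are hypothetical and chosen for illustration. The sketch shows how the two quantities discussed above could be probed numerically: the smallest Hessian eigenvalue (local convexity) and the ratio $\langle \nabla L(w), w - w^{*} \rangle / \|w - w^{*}\|^2$ (one-point convexity).

```python
import numpy as np

def make_problem(n, d, seed=0):
    """Assumed setup: Gaussian inputs, unit-norm ground truth, noiseless labels."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, d))
    w_star = np.zeros(d)
    w_star[0] = 1.0
    y = (X @ w_star) ** 2
    return X, y, w_star

def loss(w, X, y):
    # L(w) = 1/(4n) * sum_i ((x_i^T w)^2 - y_i)^2
    return np.mean(((X @ w) ** 2 - y) ** 2) / 4

def grad(w, X, y):
    # grad L(w) = 1/n * sum_i ((x_i^T w)^2 - y_i) (x_i^T w) x_i
    z = X @ w
    return X.T @ ((z ** 2 - y) * z) / len(y)

def hessian(w, X, y):
    # Hess L(w) = 1/n * sum_i (3 (x_i^T w)^2 - y_i) x_i x_i^T
    z = X @ w
    coef = 3 * z ** 2 - y
    return (X.T * coef) @ X / len(y)

def one_point_ratio(w, w_star, X, y):
    # <grad L(w), w - w*> / ||w - w*||^2; a positive lower bound on this
    # ratio over a region is what one-point strong convexity asks for.
    diff = w - w_star
    return grad(w, X, y) @ diff / (diff @ diff)

if __name__ == "__main__":
    d, n, radius = 50, 100, 0.5  # illustrative sizes, not taken from the paper
    X, y, w_star = make_problem(n, d)
    rng = np.random.default_rng(1)
    u = rng.standard_normal(d)
    w = w_star + radius * u / np.linalg.norm(u)  # point at distance `radius` from w*
    print("min Hessian eigenvalue:", np.linalg.eigvalsh(hessian(w, X, y)).min())
    print("one-point ratio:", one_point_ratio(w, w_star, X, y))
```

Sweeping `n`, `d`, and `radius` in the demo gives an informal, small-scale view of the regimes described in the abstract; it does not reproduce the paper's asymptotic statements.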
About the journal:
The IEEE Transactions on Information Theory is a journal that publishes theoretical and experimental papers concerned with the transmission, processing, and utilization of information. The boundaries of acceptable subject matter are intentionally not sharply delimited. Rather, it is hoped that as the focus of research activity changes, a flexible policy will permit this Transactions to follow suit. Current appropriate topics are best reflected by recent Tables of Contents; they are summarized in the titles of editorial areas that appear on the inside front cover.