Pub Date: 2022-03-28 DOI: 10.48550/arXiv.2203.14860
G. Huguet, Alexander Tong, Bastian Alexander Rieck, Je-chun Huang, Manik Kuchroo, M. Hirn, Guy Wolf, Smita Krishnaswamy
Diffusion condensation is a dynamic process that yields a sequence of multiscale data representations that aim to encode meaningful abstractions. It has proven effective for manifold learning, denoising, clustering, and visualization of high-dimensional data. Diffusion condensation is constructed as a time-inhomogeneous process where each step first computes and then applies a diffusion operator to the data. We theoretically analyze the convergence and evolution of this process from geometric, spectral, and topological perspectives. From a geometric perspective, we obtain convergence bounds based on the smallest transition probability and the radius of the data, whereas from a spectral perspective, our bounds are based on the eigenspectrum of the diffusion kernel. Our spectral results are of particular interest since most of the literature on data diffusion is focused on homogeneous processes. From a topological perspective, we show diffusion condensation generalizes centroid-based hierarchical clustering. We use this perspective to obtain a bound based on the number of data points, independent of their location. To understand the evolution of the data geometry beyond convergence, we use topological data analysis. We show that the condensation process itself defines an intrinsic condensation homology. We use this intrinsic topology as well as the ambient persistent homology of the condensation process to study how the data changes over diffusion time. We demonstrate both types of topological information in well-understood toy examples. Our work gives theoretical insights into the convergence of diffusion condensation, and shows that it provides a link between topological and geometric data analysis.
{"title":"Time-inhomogeneous diffusion geometry and topology","authors":"G. Huguet, Alexander Tong, Bastian Alexander Rieck, Je-chun Huang, Manik Kuchroo, M. Hirn, Guy Wolf, Smita Krishnaswamy","doi":"10.48550/arXiv.2203.14860","DOIUrl":"https://doi.org/10.48550/arXiv.2203.14860","url":null,"abstract":"Diffusion condensation is a dynamic process that yields a sequence of multiscale data representations that aim to encode meaningful abstractions. It has proven effective for manifold learning, denoising, clustering, and visualization of high-dimensional data. Diffusion condensation is constructed as a time-inhomogeneous process where each step first computes and then applies a diffusion operator to the data. We theoretically analyze the convergence and evolution of this process from geometric, spectral, and topological perspectives. From a geometric perspective, we obtain convergence bounds based on the smallest transition probability and the radius of the data, whereas from a spectral perspective, our bounds are based on the eigenspectrum of the diffusion kernel. Our spectral results are of particular interest since most of the literature on data diffusion is focused on homogeneous processes. From a topological perspective, we show diffusion condensation generalizes centroid-based hierarchical clustering. We use this perspective to obtain a bound based on the number of data points, independent of their location. To understand the evolution of the data geometry beyond convergence, we use topological data analysis. We show that the condensation process itself defines an intrinsic condensation homology. We use this intrinsic topology as well as the ambient persistent homology of the condensation process to study how the data changes over diffusion time. We demonstrate both types of topological information in well-understood toy examples. 
Our work gives theoretical insights into the convergence of diffusion condensation, and shows that it provides a link between topological and geometric data analysis.","PeriodicalId":74797,"journal":{"name":"SIAM journal on mathematics of data science","volume":"5 1","pages":"346-372"},"PeriodicalIF":0.0,"publicationDate":"2022-03-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75542675","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
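The abstract above describes each condensation step as first computing and then applying a diffusion operator to the data. A minimal sketch of that loop follows, assuming a Gaussian kernel with a fixed bandwidth; the kernel choice, the `epsilon` parameter, and the helper names `diffusion_condensation` and `diameter` are illustrative assumptions and are not taken from the paper.

```python
import numpy as np


def diffusion_condensation(X, epsilon, n_steps):
    """Hypothetical helper: repeat X <- P(X) @ X, rebuilding P from the current points."""
    X = np.asarray(X, dtype=float).copy()
    history = [X.copy()]
    for _ in range(n_steps):
        # Squared pairwise distances between the current points.
        sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
        # Assumed Gaussian affinity; the bandwidth is kept fixed here for simplicity.
        K = np.exp(-sq / epsilon)
        # Row-normalize to obtain a Markov (diffusion) operator.
        P = K / K.sum(axis=1, keepdims=True)
        # Apply the operator: each point moves to a weighted average of all points.
        X = P @ X
        history.append(X.copy())
    return history


def diameter(X):
    """Largest pairwise Euclidean distance in the point cloud."""
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    return float(np.sqrt(sq.max()))
```

Because the operator is recomputed from the updated points at every step, the process is time-inhomogeneous. Each new point is a convex combination of the previous ones, so the diameter of the data cannot grow from one step to the next.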
This paper considers a class of nonsmooth nonconvex-nonconcave min-max problems in machine learning and games. We first provide sufficient conditions for the existence of global minimax points and local minimax points. Next, we establish the first-order and second-order optimality conditions for local minimax points by using directional derivatives. These conditions reduce to smooth min-max problems with Fréchet derivatives. We apply our theoretical results to generative adversarial networks (GANs) in which two neural networks contest with each other in a game. Examples are used to illustrate applications of the new theory for training GANs.
{"title":"Optimality Conditions for Nonsmooth Nonconvex-Nonconcave Min-Max Problems and Generative Adversarial Networks","authors":"Jie Jiang, Xiaojun Chen","doi":"10.1137/22m1482238","DOIUrl":"https://doi.org/10.1137/22m1482238","url":null,"abstract":"This paper considers a class of nonsmooth nonconvex-nonconcave min-max problems in machine learning and games. We first provide sufficient conditions for the existence of global minimax points and local minimax points. Next, we establish the first-order and second-order optimality conditions for local minimax points by using directional derivatives. These conditions reduce to smooth min-max problems with Fréchet derivatives. We apply our theoretical results to generative adversarial networks (GANs) in which two neural networks contest with each other in a game. Examples are used to illustrate applications of the new theory for training GANs.","PeriodicalId":74797,"journal":{"name":"SIAM journal on mathematics of data science","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2022-03-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"44793245","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
We study a Q learning algorithm for continuous time stochastic control problems. The proposed algorithm uses the sampled state process by discretizing the state and control action spaces under piece-wise constant control processes. We show that the algorithm converges to the optimality equation of a finite Markov decision process (MDP). Using this MDP model, we provide an upper bound for the approximation error for the optimal value function of the continuous time control problem. Furthermore, we present provable upper-bounds for the performance loss of the learned control process compared to the optimal admissible control process of the original problem. The provided error upper-bounds are functions of the time and space discretization parameters, and they reveal the effect of different levels of the approximation: (i) approximation of the continuous time control problem by an MDP, (ii) use of piece-wise constant control processes, (iii) space discretization. Finally, we state a time complexity bound for the proposed algorithm as a function of the time and space discretization parameters.
{"title":"Approximate Q Learning for Controlled Diffusion Processes and Its Near Optimality","authors":"Erhan Bayraktar, A. D. Kara","doi":"10.1137/22m1484201","DOIUrl":"https://doi.org/10.1137/22m1484201","url":null,"abstract":"We study a Q learning algorithm for continuous time stochastic control problems. The proposed algorithm uses the sampled state process by discretizing the state and control action spaces under piece-wise constant control processes. We show that the algorithm converges to the optimality equation of a finite Markov decision process (MDP). Using this MDP model, we provide an upper bound for the approximation error for the optimal value function of the continuous time control problem. Furthermore, we present provable upper-bounds for the performance loss of the learned control process compared to the optimal admissible control process of the original problem. The provided error upper-bounds are functions of the time and space discretization parameters, and they reveal the effect of different levels of the approximation: (i) approximation of the continuous time control problem by an MDP, (ii) use of piece-wise constant control processes, (iii) space discretization. Finally, we state a time complexity bound for the proposed algorithm as a function of the time and space discretization parameters.","PeriodicalId":74797,"journal":{"name":"SIAM journal on mathematics of data science","volume":"91 1","pages":"615-638"},"PeriodicalIF":0.0,"publicationDate":"2022-03-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74206748","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
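The abstract above outlines Q learning on a sampled state process, with discretized state and action spaces and controls held constant between samples. The sketch below illustrates that pipeline on a toy one-dimensional problem. Everything specific in it is an assumption made for illustration: the dynamics dX = u dt + sigma dW, the quadratic running cost, the clipping of the state to [-1, 1], the per-step discount `beta`, the uniform exploration, and the function name `discretized_q_learning`. None of these come from the paper.

```python
import numpy as np


def discretized_q_learning(n_states=21, actions=(-1.0, 0.0, 1.0), h=0.1,
                           sigma=0.5, beta=0.9, n_iters=20000, seed=0):
    """Hypothetical sketch: tabular Q learning on a time- and space-discretized diffusion."""
    rng = np.random.default_rng(seed)
    grid = np.linspace(-1.0, 1.0, n_states)      # space discretization
    Q = np.zeros((n_states, len(actions)))
    counts = np.zeros_like(Q)
    s = n_states // 2                            # start at the middle grid point
    for _ in range(n_iters):
        a = int(rng.integers(len(actions)))      # uniform exploration (assumed)
        x, u = grid[s], actions[a]
        # Assumed running cost accumulated over one sampling period of length h.
        cost = h * (x ** 2 + 0.1 * u ** 2)
        # Euler-Maruyama step with the control held constant over [t, t + h).
        x_next = x + u * h + sigma * np.sqrt(h) * rng.standard_normal()
        x_next = float(np.clip(x_next, -1.0, 1.0))
        # Map the sampled state to its nearest grid point.
        s_next = int(np.argmin(np.abs(grid - x_next)))
        counts[s, a] += 1
        lr = 1.0 / counts[s, a]                  # decaying step size
        # Cost-minimizing Q update on the finite model.
        Q[s, a] += lr * (cost + beta * Q[s_next].min() - Q[s, a])
        s = s_next
    return grid, Q
```

A piecewise-constant control for the toy problem can then be read off as `actions[Q[s].argmin()]` at each grid state `s`. The sampling period `h` and the grid resolution `n_states` play the role of the time and space discretization parameters mentioned in the abstract.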
This paper discusses the efficiency of Hybrid Primal-Dual (HPD) type algorithms to approximately solve discrete Optimal Transport (OT) and Wasserstein Barycenter (WB) problems, with and without entropic regularization. Our first contribution is an analysis showing that these methods yield state-of-the-art convergence rates, both theoretically and practically. Next, we extend the HPD algorithm with linesearch proposed by Malitsky and Pock in 2018 to the setting where the dual space has a Bregman divergence, and the dual function is relatively strongly convex with respect to the Bregman kernel. This extension yields a new method for OT and WB problems based on smoothing of the objective that also achieves state-of-the-art convergence rates. Finally, we introduce a new Bregman divergence based on a scaled entropy function that makes the algorithm numerically stable and reduces the smoothing, leading to sparse solutions of OT and WB problems. We complement our findings with numerical experiments and comparisons.
{"title":"Accelerated Bregman Primal-Dual Methods Applied to Optimal Transport and Wasserstein Barycenter Problems","authors":"A. Chambolle, Juan Pablo Contreras","doi":"10.1137/22m1481865","DOIUrl":"https://doi.org/10.1137/22m1481865","url":null,"abstract":"This paper discusses the efficiency of Hybrid Primal-Dual (HPD) type algorithms to approximately solve discrete Optimal Transport (OT) and Wasserstein Barycenter (WB) problems, with and without entropic regularization. Our first contribution is an analysis showing that these methods yield state-of-the-art convergence rates, both theoretically and practically. Next, we extend the HPD algorithm with linesearch proposed by Malitsky and Pock in 2018 to the setting where the dual space has a Bregman divergence, and the dual function is relatively strongly convex with respect to the Bregman kernel. This extension yields a new method for OT and WB problems based on smoothing of the objective that also achieves state-of-the-art convergence rates. Finally, we introduce a new Bregman divergence based on a scaled entropy function that makes the algorithm numerically stable and reduces the smoothing, leading to sparse solutions of OT and WB problems. We complement our findings with numerical experiments and comparisons.","PeriodicalId":74797,"journal":{"name":"SIAM journal on mathematics of data science","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2022-03-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"48335589","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Using rough path techniques, we provide a priori estimates for the output of Deep Residual Neural Networks in terms of both the input data and the (trained) network weights. As trained network weights are typically very rough when seen as functions of the layer, we propose to derive stability bounds in terms of the total $p$-variation of trained weights for any $p\in[1,3]$. Unlike the $C^1$-theory underlying the neural ODE literature, our estimates remain bounded even in the limiting case of weights behaving like Brownian motions, as suggested in [arXiv:2105.12245]. Mathematically, we interpret residual neural networks as solutions to (rough) difference equations, and analyse them based on recent results of discrete time signatures and rough path theory.
{"title":"Stability of Deep Neural Networks via discrete rough paths","authors":"Christian Bayer, P. Friz, N. Tapia","doi":"10.1137/22M1472358","DOIUrl":"https://doi.org/10.1137/22M1472358","url":null,"abstract":"Using rough path techniques, we provide a priori estimates for the output of Deep Residual Neural Networks in terms of both the input data and the (trained) network weights. As trained network weights are typically very rough when seen as functions of the layer, we propose to derive stability bounds in terms of the total $p$-variation of trained weights for any $p\\in[1,3]$. Unlike the $C^1$-theory underlying the neural ODE literature, our estimates remain bounded even in the limiting case of weights behaving like Brownian motions, as suggested in [arXiv:2105.12245]. Mathematically, we interpret residual neural networks as solutions to (rough) difference equations, and analyse them based on recent results of discrete time signatures and rough path theory.","PeriodicalId":74797,"journal":{"name":"SIAM journal on mathematics of data science","volume":"35 1","pages":"50-76"},"PeriodicalIF":0.0,"publicationDate":"2022-01-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84277562","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The non-convexity of the artificial neural network (ANN) training landscape brings inherent optimization difficulties. While the traditional back-propagation stochastic gradient descent (SGD) algorithm and its variants are effective in certain cases, they can become stuck at spurious local minima and are sensitive to initializations and hyperparameters. Recent work has shown that the training of an ANN with ReLU activations can be reformulated as a convex program, bringing hope to globally optimizing interpretable ANNs. However, naively solving the convex training formulation has an exponential complexity, and even an approximation heuristic requires cubic time. In this work, we characterize the quality of this approximation and develop two efficient algorithms that train ANNs with global convergence guarantees. The first algorithm is based on the alternating direction method of multipliers (ADMM). It solves both the exact convex formulation and the approximate counterpart. Linear global convergence is achieved, and the initial several iterations often yield a solution with high prediction accuracy. When solving the approximate formulation, the per-iteration time complexity is quadratic. The second algorithm, based on the "sampled convex programs" theory, is simpler to implement. It solves unconstrained convex formulations and converges to an approximately globally optimal classifier. The non-convexity of the ANN training landscape is exacerbated when adversarial training is considered. We apply the robust convex optimization theory to convex training and develop convex formulations that train ANNs robust to adversarial inputs. Our analysis explicitly focuses on one-hidden-layer fully connected ANNs, but can extend to more sophisticated architectures.
{"title":"Efficient Global Optimization of Two-layer ReLU Networks: Quadratic-time Algorithms and Adversarial Training","authors":"Yatong Bai, Tanmay Gautam, S. Sojoudi","doi":"10.1137/21m1467134","DOIUrl":"https://doi.org/10.1137/21m1467134","url":null,"abstract":"The non-convexity of the artificial neural network (ANN) training landscape brings inherent optimization difficulties. While the traditional back-propagation stochastic gradient descent (SGD) algorithm and its variants are effective in certain cases, they can become stuck at spurious local minima and are sensitive to initializations and hyperparameters. Recent work has shown that the training of an ANN with ReLU activations can be reformulated as a convex program, bringing hope to globally optimizing interpretable ANNs. However, naively solving the convex training formulation has an exponential complexity, and even an approximation heuristic requires cubic time. In this work, we characterize the quality of this approximation and develop two efficient algorithms that train ANNs with global convergence guarantees. The first algorithm is based on the alternating direction method of multipliers (ADMM). It solves both the exact convex formulation and the approximate counterpart. Linear global convergence is achieved, and the initial several iterations often yield a solution with high prediction accuracy. When solving the approximate formulation, the per-iteration time complexity is quadratic. The second algorithm, based on the \"sampled convex programs\" theory, is simpler to implement. It solves unconstrained convex formulations and converges to an approximately globally optimal classifier. The non-convexity of the ANN training landscape is exacerbated when adversarial training is considered. We apply the robust convex optimization theory to convex training and develop convex formulations that train ANNs robust to adversarial inputs. Our analysis explicitly focuses on one-hidden-layer fully connected ANNs, but can extend to more sophisticated architectures.","PeriodicalId":74797,"journal":{"name":"SIAM journal on mathematics of data science","volume":"60 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2022-01-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"64315036","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Cenk Baykal, Lucas Liebenwein, Igor Gilitschenski, Dan Feldman, Daniela Rus
{"title":"Sensitivity-Informed Provable Pruning of Neural Networks","authors":"Cenk Baykal, Lucas Liebenwein, Igor Gilitschenski, Dan Feldman, Daniela Rus","doi":"10.1137/20m1383239","DOIUrl":"https://doi.org/10.1137/20m1383239","url":null,"abstract":"","PeriodicalId":74797,"journal":{"name":"SIAM journal on mathematics of data science","volume":"15 1","pages":"26-45"},"PeriodicalIF":0.0,"publicationDate":"2022-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86155867","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2022-01-01 DOI: 10.1007/978-3-031-06664-1
H. Schenck
{"title":"Algebraic Foundations for Applied Topology and Data Analysis","authors":"H. Schenck","doi":"10.1007/978-3-031-06664-1","DOIUrl":"https://doi.org/10.1007/978-3-031-06664-1","url":null,"abstract":"","PeriodicalId":74797,"journal":{"name":"SIAM journal on mathematics of data science","volume":"47 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2022-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85299807","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Estimating the rank of a corrupted data matrix is an important task in data analysis, most notably for choosing the number of components in PCA. Significant progress on this task was achieved using random matrix theory by characterizing the spectral properties of large noise matrices. However, utilizing such tools is not straightforward when the data matrix consists of count random variables, e.g., Poisson, in which case the noise can be heteroskedastic with an unknown variance in each entry. In this work, we focus on a Poisson random matrix with independent entries and propose a simple procedure, termed biwhitening, for estimating the rank of the underlying signal matrix (i.e., the Poisson parameter matrix) without any prior knowledge. Our approach is based on the key observation that one can scale the rows and columns of the data matrix simultaneously so that the spectrum of the corresponding noise agrees with the standard Marchenko-Pastur (MP) law, justifying the use of the MP upper edge as a threshold for rank selection. Importantly, the required scaling factors can be estimated directly from the observations by solving a matrix scaling problem via the Sinkhorn-Knopp algorithm. Aside from the Poisson, our approach is extended to families of distributions that satisfy a quadratic relation between the mean and the variance, such as the generalized Poisson, binomial, negative binomial, gamma, and many others. This quadratic relation can also account for missing entries in the data. We conduct numerical experiments that corroborate our theoretical findings, and showcase the advantage of our approach for rank estimation in challenging regimes. Furthermore, we demonstrate the favorable performance of our approach on several real datasets of single-cell RNA sequencing (scRNA-seq), High-Throughput Chromosome Conformation Capture (Hi-C), and document topic modeling.
{"title":"Biwhitening Reveals the Rank of a Count Matrix.","authors":"Boris Landa, Thomas T C K Zhang, Yuval Kluger","doi":"10.1137/21m1456807","DOIUrl":"https://doi.org/10.1137/21m1456807","url":null,"abstract":"Estimating the rank of a corrupted data matrix is an important task in data analysis, most notably for choosing the number of components in PCA. Significant progress on this task was achieved using random matrix theory by characterizing the spectral properties of large noise matrices. However, utilizing such tools is not straightforward when the data matrix consists of count random variables, e.g., Poisson, in which case the noise can be heteroskedastic with an unknown variance in each entry. In this work, we focus on a Poisson random matrix with independent entries and propose a simple procedure, termed biwhitening, for estimating the rank of the underlying signal matrix (i.e., the Poisson parameter matrix) without any prior knowledge. Our approach is based on the key observation that one can scale the rows and columns of the data matrix simultaneously so that the spectrum of the corresponding noise agrees with the standard Marchenko-Pastur (MP) law, justifying the use of the MP upper edge as a threshold for rank selection. Importantly, the required scaling factors can be estimated directly from the observations by solving a matrix scaling problem via the Sinkhorn-Knopp algorithm. Aside from the Poisson, our approach is extended to families of distributions that satisfy a quadratic relation between the mean and the variance, such as the generalized Poisson, binomial, negative binomial, gamma, and many others. This quadratic relation can also account for missing entries in the data. We conduct numerical experiments that corroborate our theoretical findings, and showcase the advantage of our approach for rank estimation in challenging regimes. Furthermore, we demonstrate the favorable performance of our approach on several real datasets of single-cell RNA sequencing (scRNA-seq), High-Throughput Chromosome Conformation Capture (Hi-C), and document topic modeling.","PeriodicalId":74797,"journal":{"name":"SIAM journal on mathematics of data science","volume":"4 4","pages":"1420-1446"},"PeriodicalIF":0.0,"publicationDate":"2022-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10417917/pdf/nihms-1888877.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10006236","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
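The biwhitening abstract above describes scaling rows and columns with the Sinkhorn-Knopp algorithm and then thresholding the spectrum at the Marchenko-Pastur upper edge. The sketch below is one way to read that recipe for the Poisson case. It rests on several assumptions that are not stated in the abstract: the observed counts are used as the entrywise variance estimate, the scaling targets are row sums of n and column sums of m, the square roots of the scaling factors are applied to the data, and no row or column is entirely zero. The function names `sinkhorn_scale` and `biwhiten_rank` are hypothetical.

```python
import numpy as np


def sinkhorn_scale(A, n_iter=1000):
    """Find u, v so that diag(u) @ A @ diag(v) has row sums n and column sums m."""
    m, n = A.shape
    u, v = np.ones(m), np.ones(n)
    for _ in range(n_iter):
        u = n / (A @ v)      # fix the row sums
        v = m / (A.T @ u)    # fix the column sums
    return u, v


def biwhiten_rank(Y, n_iter=1000):
    """Hypothetical sketch: count eigenvalues above the MP upper edge after scaling."""
    Y = np.asarray(Y, dtype=float)
    if Y.shape[0] > Y.shape[1]:
        Y = Y.T                                   # work with m <= n
    m, n = Y.shape
    # Assumption: for Poisson data the counts themselves estimate the variances.
    u, v = sinkhorn_scale(Y, n_iter)
    Y_scaled = np.sqrt(u)[:, None] * Y * np.sqrt(v)[None, :]
    # Eigenvalues of (1/n) * Y_scaled @ Y_scaled.T from the singular values.
    eigs = np.linalg.svd(Y_scaled, compute_uv=False) ** 2 / n
    mp_upper_edge = (1.0 + np.sqrt(m / n)) ** 2
    return int(np.sum(eigs > mp_upper_edge))
```

On a count matrix drawn as `rng.poisson(signal)` for a low-rank positive `signal`, `biwhiten_rank` returns the number of eigenvalues that exceed the edge. How closely that count tracks the true rank depends on the matrix size and the signal strength.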
In multi-agent reinforcement learning (MARL), independent learners are those that do not observe the actions of other agents in the system. Due to the decentralization of information, it is challenging to design independent learners that drive play to equilibrium. This paper investigates the feasibility of using satisficing dynamics to guide independent learners to approximate equilibrium in stochastic games. For $\epsilon \geq 0$, an $\epsilon$-satisficing policy update rule is any rule that instructs the agent to not change its policy when it is $\epsilon$-best-responding to the policies of the remaining players; $\epsilon$-satisficing paths are defined to be sequences of joint policies obtained when each agent uses some $\epsilon$-satisficing policy update rule to select its next policy. We establish structural results on the existence of $\epsilon$-satisficing paths into $\epsilon$-equilibrium in both symmetric $N$-player games and general stochastic games with two players. We then present an independent learning algorithm for $N$-player symmetric games and give high probability guarantees of convergence to $\epsilon$-equilibrium under self-play. This guarantee is made using symmetry alone, leveraging the previously unexploited structure of $\epsilon$-satisficing paths.
{"title":"Satisficing Paths and Independent Multiagent Reinforcement Learning in Stochastic Games","authors":"Bora Yongacoglu, Gürdal Arslan, S. Yuksel","doi":"10.1137/22m1515112","DOIUrl":"https://doi.org/10.1137/22m1515112","url":null,"abstract":"In multi-agent reinforcement learning (MARL), independent learners are those that do not observe the actions of other agents in the system. Due to the decentralization of information, it is challenging to design independent learners that drive play to equilibrium. This paper investigates the feasibility of using satisficing dynamics to guide independent learners to approximate equilibrium in stochastic games. For $\\epsilon \\geq 0$, an $\\epsilon$-satisficing policy update rule is any rule that instructs the agent to not change its policy when it is $\\epsilon$-best-responding to the policies of the remaining players; $\\epsilon$-satisficing paths are defined to be sequences of joint policies obtained when each agent uses some $\\epsilon$-satisficing policy update rule to select its next policy. We establish structural results on the existence of $\\epsilon$-satisficing paths into $\\epsilon$-equilibrium in both symmetric $N$-player games and general stochastic games with two players. We then present an independent learning algorithm for $N$-player symmetric games and give high probability guarantees of convergence to $\\epsilon$-equilibrium under self-play. This guarantee is made using symmetry alone, leveraging the previously unexploited structure of $\\epsilon$-satisficing paths.","PeriodicalId":74797,"journal":{"name":"SIAM journal on mathematics of data science","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2021-10-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"48563766","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
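The definition of an epsilon-satisficing update rule in the abstract above can be illustrated on a much simpler setting than the one the paper studies. The sketch below assumes a stateless two-player matrix game with pure actions standing in for policies, and it lets each agent evaluate its own payoffs directly; the paper instead treats stochastic games with independent learners. The random resampling used when an agent is not satisfied is only one possible satisficing rule, chosen here for illustration, and all function names are hypothetical.

```python
import random


def is_eps_best_response(payoff, own, other, eps):
    """payoff[a][b] is the agent's payoff when it plays a and the opponent plays b."""
    best = max(payoff[a][other] for a in range(len(payoff)))
    return payoff[own][other] >= best - eps


def satisficing_step(payoffs, joint, eps, rng):
    """One synchronous update of both players; payoffs[i] is player i's payoff table."""
    new = list(joint)
    for i in (0, 1):
        own, other = joint[i], joint[1 - i]
        # Satisficing constraint: an eps-best-responding agent must not change.
        if not is_eps_best_response(payoffs[i], own, other, eps):
            # Assumed exploration rule: otherwise pick an action uniformly at random.
            new[i] = rng.randrange(len(payoffs[i]))
    return tuple(new)


def satisficing_path(payoffs, joint, eps, n_steps, seed=0):
    """Generate a sequence of joint actions under the update rule above."""
    rng = random.Random(seed)
    path = [tuple(joint)]
    for _ in range(n_steps):
        joint = satisficing_step(payoffs, joint, eps, rng)
        path.append(joint)
    return path
```

In this toy setting, any joint action in which both players are epsilon-best-responding is absorbing: once the path reaches it, neither agent is allowed to move.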