Entropic regularization of neural networks: Self-similar approximations

Pub Date: 2024-04-16 | DOI: 10.1016/j.jspi.2024.106181
Amir R. Asadi, Po-Ling Loh
{"title":"神经网络的熵正则化:自相似近似","authors":"Amir R. Asadi,&nbsp;Po-Ling Loh","doi":"10.1016/j.jspi.2024.106181","DOIUrl":null,"url":null,"abstract":"<div><p>This paper focuses on entropic regularization and its multiscale extension in neural network learning. We leverage established results that characterize the optimizer of entropic regularization methods and their connection with generalization bounds. To avoid the significant computational complexity involved in sampling from the optimal multiscale Gibbs distributions, we describe how to make measured concessions in optimality by using self-similar approximating distributions. We study such scale-invariant approximations for linear neural networks and further extend the approximations to neural networks with nonlinear activation functions. We then illustrate the application of our proposed approach through empirical simulation. By navigating the interplay between optimization and computational efficiency, our research contributes to entropic regularization theory, proposing a practical method that embraces symmetry across scales.</p></div>","PeriodicalId":0,"journal":{"name":"","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-04-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S0378375824000387/pdfft?md5=fcc1f48fea9b9d957df56a1c168f3f74&pid=1-s2.0-S0378375824000387-main.pdf","citationCount":"0","resultStr":"{\"title\":\"Entropic regularization of neural networks: Self-similar approximations\",\"authors\":\"Amir R. Asadi,&nbsp;Po-Ling Loh\",\"doi\":\"10.1016/j.jspi.2024.106181\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><p>This paper focuses on entropic regularization and its multiscale extension in neural network learning. We leverage established results that characterize the optimizer of entropic regularization methods and their connection with generalization bounds. To avoid the significant computational complexity involved in sampling from the optimal multiscale Gibbs distributions, we describe how to make measured concessions in optimality by using self-similar approximating distributions. We study such scale-invariant approximations for linear neural networks and further extend the approximations to neural networks with nonlinear activation functions. We then illustrate the application of our proposed approach through empirical simulation. 
By navigating the interplay between optimization and computational efficiency, our research contributes to entropic regularization theory, proposing a practical method that embraces symmetry across scales.</p></div>\",\"PeriodicalId\":0,\"journal\":{\"name\":\"\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":0.0,\"publicationDate\":\"2024-04-16\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://www.sciencedirect.com/science/article/pii/S0378375824000387/pdfft?md5=fcc1f48fea9b9d957df56a1c168f3f74&pid=1-s2.0-S0378375824000387-main.pdf\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"\",\"FirstCategoryId\":\"100\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S0378375824000387\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"","FirstCategoryId":"100","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0378375824000387","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 0

Abstract


This paper focuses on entropic regularization and its multiscale extension in neural network learning. We leverage established results that characterize the optimizer of entropic regularization methods and their connection with generalization bounds. To avoid the significant computational complexity involved in sampling from the optimal multiscale Gibbs distributions, we describe how to make measured concessions in optimality by using self-similar approximating distributions. We study such scale-invariant approximations for linear neural networks and further extend the approximations to neural networks with nonlinear activation functions. We then illustrate the application of our proposed approach through empirical simulation. By navigating the interplay between optimization and computational efficiency, our research contributes to entropic regularization theory, proposing a practical method that embraces symmetry across scales.
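For concreteness, the established characterization alluded to above can be sketched as follows (a standard result about entropic regularization; the notation $L$, $P$, $Q$, $\beta$ below is illustrative and not taken from the paper): among distributions $Q$ over network weights $w$, the entropically regularized objective is minimized by a Gibbs distribution,

\[
Q^\star = \arg\min_{Q} \left\{ \mathbb{E}_{w \sim Q}\big[L(w)\big] + \tfrac{1}{\beta}\,\mathrm{KL}\!\left(Q \,\|\, P\right) \right\},
\qquad
\frac{\mathrm{d}Q^\star}{\mathrm{d}P}(w) = \frac{e^{-\beta L(w)}}{\mathbb{E}_{w' \sim P}\!\left[e^{-\beta L(w')}\right]},
\]

where $L$ is the training loss, $P$ a prior over weights, and $\beta > 0$ an inverse-temperature parameter. The computational cost of sampling from such Gibbs distributions (and their multiscale analogues) is what motivates the self-similar approximations studied in the paper.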
