{"title":"神经熵","authors":"Akhil Premkumar","doi":"arxiv-2409.03817","DOIUrl":null,"url":null,"abstract":"We examine the connection between deep learning and information theory\nthrough the paradigm of diffusion models. Using well-established principles\nfrom non-equilibrium thermodynamics we can characterize the amount of\ninformation required to reverse a diffusive process. Neural networks store this\ninformation and operate in a manner reminiscent of Maxwell's demon during the\ngenerative stage. We illustrate this cycle using a novel diffusion scheme we\ncall the entropy matching model, wherein the information conveyed to the\nnetwork during training exactly corresponds to the entropy that must be negated\nduring reversal. We demonstrate that this entropy can be used to analyze the\nencoding efficiency and storage capacity of the network. This conceptual\npicture blends elements of stochastic optimal control, thermodynamics,\ninformation theory, and optimal transport, and raises the prospect of applying\ndiffusion models as a test bench to understand neural networks.","PeriodicalId":501082,"journal":{"name":"arXiv - MATH - Information Theory","volume":"56 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-09-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Neural Entropy\",\"authors\":\"Akhil Premkumar\",\"doi\":\"arxiv-2409.03817\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"We examine the connection between deep learning and information theory\\nthrough the paradigm of diffusion models. Using well-established principles\\nfrom non-equilibrium thermodynamics we can characterize the amount of\\ninformation required to reverse a diffusive process. Neural networks store this\\ninformation and operate in a manner reminiscent of Maxwell's demon during the\\ngenerative stage. We illustrate this cycle using a novel diffusion scheme we\\ncall the entropy matching model, wherein the information conveyed to the\\nnetwork during training exactly corresponds to the entropy that must be negated\\nduring reversal. We demonstrate that this entropy can be used to analyze the\\nencoding efficiency and storage capacity of the network. 
This conceptual\\npicture blends elements of stochastic optimal control, thermodynamics,\\ninformation theory, and optimal transport, and raises the prospect of applying\\ndiffusion models as a test bench to understand neural networks.\",\"PeriodicalId\":501082,\"journal\":{\"name\":\"arXiv - MATH - Information Theory\",\"volume\":\"56 1\",\"pages\":\"\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2024-09-05\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"arXiv - MATH - Information Theory\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/arxiv-2409.03817\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"arXiv - MATH - Information Theory","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/arxiv-2409.03817","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
We examine the connection between deep learning and information theory through the paradigm of diffusion models. Using well-established principles from non-equilibrium thermodynamics, we can characterize the amount of information required to reverse a diffusive process. Neural networks store this information and, during the generative stage, operate in a manner reminiscent of Maxwell's demon.
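For background, and independent of this paper's particular construction, the standard score-based picture makes the reversal requirement concrete: a forward diffusion

$$ dX_t = f(X_t, t)\,dt + g(t)\,dW_t $$

is reversed by the stochastic differential equation

$$ dX_t = \left[ f(X_t, t) - g(t)^2\, \nabla_x \log p_t(X_t) \right] dt + g(t)\,d\bar{W}_t, $$

so generation is possible only if the score $\nabla_x \log p_t$ of the intermediate densities is available. This score is precisely the information the network must learn and store in order to undo the diffusion.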
We illustrate this cycle using a novel diffusion scheme we call the entropy matching model, wherein the information conveyed to the network during training exactly corresponds to the entropy that must be negated during reversal. We demonstrate that this entropy can be used to analyze the encoding efficiency and storage capacity of the network.
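For reference, in standard diffusion models the training signal that conveys this information is the denoising score matching objective; the entropy matching scheme above is a variant whose exact form is given in the paper and is not reproduced here:

$$ \mathcal{L}(\theta) = \mathbb{E}_{t,\, x_0,\, x_t \sim p_t(x_t \mid x_0)} \left[ \lambda(t) \left\| s_\theta(x_t, t) - \nabla_{x_t} \log p_t(x_t \mid x_0) \right\|^2 \right], $$

where $s_\theta$ is the network's score estimate and $\lambda(t)$ is a time-dependent weighting.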
This conceptual picture blends elements of stochastic optimal control, thermodynamics, information theory, and optimal transport, and raises the prospect of using diffusion models as a test bench for understanding neural networks.