{"title":"物理、数学和深度学习之间的联系。","authors":"Jean Thierry-Mieg","doi":"10.31526/lhep.3.2019.110","DOIUrl":null,"url":null,"abstract":"<p><p>Starting from Fermat's principle of least action, which governs classical and quantum mechanics and from the theory of exterior differential forms, which governs the geometry of curved manifolds, we show how to derive the equations governing neural networks in an intrinsic, coordinate-invariant way, where the loss function plays the role of the Hamiltonian. To be covariant, these equations imply a layer metric which is instrumental in pretraining and explains the role of conjugation when using complex numbers. The differential formalism clarifies the relation of the gradient descent optimizer with Aristotelian and Newtonian mechanics. The Bayesian paradigm is then analyzed as a renormalizable theory yielding a new derivation of the Bayesian information criterion. We hope that this formal presentation of the differential geometry of neural networks will encourage some physicists to dive into deep learning and, reciprocally, that the specialists of deep learning will better appreciate the close interconnection of their subject with the foundations of classical and quantum field theory.</p>","PeriodicalId":36085,"journal":{"name":"Letters in High Energy Physics","volume":"2 3","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2019-08-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8462849/pdf/nihms-1732645.pdf","citationCount":"0","resultStr":"{\"title\":\"Connections between physics, mathematics, and deep learning.\",\"authors\":\"Jean Thierry-Mieg\",\"doi\":\"10.31526/lhep.3.2019.110\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><p>Starting from Fermat's principle of least action, which governs classical and quantum mechanics and from the theory of exterior differential forms, which governs the geometry of curved manifolds, we show how to derive the equations governing neural networks in an intrinsic, coordinate-invariant way, where the loss function plays the role of the Hamiltonian. To be covariant, these equations imply a layer metric which is instrumental in pretraining and explains the role of conjugation when using complex numbers. The differential formalism clarifies the relation of the gradient descent optimizer with Aristotelian and Newtonian mechanics. The Bayesian paradigm is then analyzed as a renormalizable theory yielding a new derivation of the Bayesian information criterion. 
We hope that this formal presentation of the differential geometry of neural networks will encourage some physicists to dive into deep learning and, reciprocally, that the specialists of deep learning will better appreciate the close interconnection of their subject with the foundations of classical and quantum field theory.</p>\",\"PeriodicalId\":36085,\"journal\":{\"name\":\"Letters in High Energy Physics\",\"volume\":\"2 3\",\"pages\":\"\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2019-08-06\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8462849/pdf/nihms-1732645.pdf\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Letters in High Energy Physics\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.31526/lhep.3.2019.110\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"Physics and Astronomy\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Letters in High Energy Physics","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.31526/lhep.3.2019.110","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"Physics and Astronomy","Score":null,"Total":0}
Connections between physics, mathematics, and deep learning.
Abstract: Starting from Fermat's principle of least action, which governs classical and quantum mechanics, and from the theory of exterior differential forms, which governs the geometry of curved manifolds, we show how to derive the equations governing neural networks in an intrinsic, coordinate-invariant way, where the loss function plays the role of the Hamiltonian. To be covariant, these equations imply a layer metric, which is instrumental in pretraining and explains the role of conjugation when using complex numbers. The differential formalism clarifies the relation of the gradient-descent optimizer to Aristotelian and Newtonian mechanics. The Bayesian paradigm is then analyzed as a renormalizable theory, yielding a new derivation of the Bayesian information criterion. We hope that this formal presentation of the differential geometry of neural networks will encourage some physicists to dive into deep learning and, reciprocally, that specialists in deep learning will better appreciate the close interconnection of their subject with the foundations of classical and quantum field theory.
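The abstract's contrast between Aristotelian and Newtonian mechanics maps onto a standard distinction between first-order and second-order optimizer dynamics: plain gradient descent sets the parameter velocity proportional to the force (minus the loss gradient), while momentum methods discretize an inertial equation of motion with friction. The sketch below is a minimal illustration of that general correspondence, not code from the paper; the toy quadratic loss, step sizes, and function names are assumptions chosen for clarity.

```python
# Illustrative sketch (assumed example, not from the paper): first-order
# ("Aristotelian") gradient descent vs. second-order ("Newtonian") heavy-ball
# momentum, with the loss acting as the potential energy.
import numpy as np

def loss_grad(w):
    # Gradient of a stand-in quadratic loss L(w) = 0.5 * ||w||^2.
    return w

def gradient_descent(w, lr=0.1, steps=100):
    # First-order dynamics: dw/dt = -grad L (velocity proportional to force).
    for _ in range(steps):
        w = w - lr * loss_grad(w)
    return w

def heavy_ball(w, lr=0.1, momentum=0.9, steps=100):
    # Discretization of m * d2w/dt2 + gamma * dw/dt = -grad L
    # (Polyak's heavy-ball / momentum update).
    v = np.zeros_like(w)
    for _ in range(steps):
        v = momentum * v - lr * loss_grad(w)
        w = w + v
    return w

w0 = np.array([2.0, -1.0])
print(gradient_descent(w0.copy()))  # relaxes monotonically toward the minimum at 0
print(heavy_ball(w0.copy()))        # behaves like an inertial particle with friction
```

How the paper itself formalizes this correspondence in coordinate-invariant, differential-geometric terms is developed in the full text linked above.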