Connections Between Numerical Algorithms for PDEs and Neural Networks
Tobias Alt, Karl Schrader, Matthias Augustin, Pascal Peter, Joachim Weickert
Journal of Mathematical Imaging and Vision
DOI: 10.1007/s10851-022-01106-x
Published: 2023-01-01 (Epub 2022-06-24)
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9883332/pdf/

Abstract:
We investigate numerous structural connections between numerical algorithms for partial differential equations (PDEs) and neural architectures. Our goal is to transfer the rich set of mathematical foundations from the world of PDEs to neural networks. Besides structural insights, we provide concrete examples and experimental evaluations of the resulting architectures. Using the example of generalised nonlinear diffusion in 1D, we consider explicit schemes, acceleration strategies thereof, implicit schemes, and multigrid approaches. We connect these concepts to residual networks, recurrent neural networks, and U-net architectures. Our findings inspire a symmetric residual network design with provable stability guarantees and justify the effectiveness of skip connections in neural networks from a numerical perspective. Moreover, we present U-net architectures that implement multigrid techniques for learning efficient solutions of partial differential equation models, and motivate uncommon design choices such as trainable nonmonotone activation functions. Experimental evaluations show that the proposed architectures save half of the trainable parameters and can thus outperform standard ones with the same model complexity. Our considerations serve as a basis for explaining the success of popular neural architectures and provide a blueprint for developing new mathematically well-founded neural building blocks.
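To make the explicit-scheme/residual-network correspondence from the abstract concrete, here is a minimal NumPy sketch (illustrative only, not the authors' code; the function name, step size, and the exponential Perona-Malik diffusivity are choices made for this example). One explicit step of 1D nonlinear diffusion has exactly the shape of a residual block u_new = u - tau * K^T Phi(K u): an inner "convolution" K (forward difference), a nonlinear "activation" Phi (the flux function, which is nonmonotone here), and an outer "convolution" given by the negated transpose -K^T (backward difference).

```python
import numpy as np

def explicit_diffusion_step(u, tau=0.2, lam=1.0):
    """One explicit step of 1D nonlinear diffusion, written as a
    residual block  u_new = u - tau * K^T Phi(K u).
    Illustrative sketch; names and parameters are example choices."""
    # inner "convolution" K u: forward difference, reflecting boundary
    du = np.diff(u, append=u[-1:])
    # "activation" Phi(s) = g(s^2) * s with the exponential Perona-Malik
    # diffusivity g(s^2) = exp(-(s/lam)^2); note that Phi is nonmonotone,
    # echoing the nonmonotone activation functions mentioned above
    flux = du * np.exp(-((du / lam) ** 2))
    # outer "convolution" -K^T: backward difference (discrete divergence),
    # with zero flux entering at the left boundary
    div = np.diff(flux, prepend=0.0)
    # residual update; tau <= 0.5 keeps this explicit scheme stable
    return u + tau * div

# usage: smooth a noisy step edge; the edge survives, the noise does not
rng = np.random.default_rng(0)
u = np.concatenate([np.zeros(64), np.ones(64)]) + 0.1 * rng.standard_normal(128)
for _ in range(100):   # 100 stacked residual blocks with shared weights
    u = explicit_diffusion_step(u)
```

Iterating the step corresponds to stacking residual blocks with shared weights, and the symmetric K / -K^T pairing is the structural feature behind the provable stability guarantee the abstract refers to: for this 1D scheme with unit grid size and diffusivity bounded by 1, any tau <= 0.5 is stable.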
Journal Introduction:
The Journal of Mathematical Imaging and Vision is a technical journal devoted to important new developments in mathematical imaging. It publishes research articles, invited papers, and expository articles.
Current developments in new image processing hardware, the advent of multisensor data fusion, and rapid advances in vision research have led to explosive growth in the interdisciplinary field of imaging science. This growth has resulted in highly sophisticated mathematical models and theories. The journal emphasizes the role of mathematics as a rigorous basis for imaging science, providing a sound alternative to existing journals in this area. Contributions are judged on the basis of their mathematical content. Articles may be physically speculative but must be mathematically sound. Emphasis is placed on innovative or established mathematical techniques applied to vision and imaging problems in a novel way, as well as on new developments and problems in mathematics arising from these applications.
The scope of the journal includes:
computational models of vision
imaging algebra and mathematical morphology
mathematical methods in reconstruction, compactification, and coding
filter theory
probabilistic, statistical, geometric, topological, and fractal techniques and models in imaging science
inverse optics
wave theory.
Specific application areas of interest include, but are not limited to:
all aspects of image formation and representation
medical, biological, industrial, geophysical, astronomical and military imaging
image analysis and image understanding
parallel and distributed computing
computer vision architecture design.