On spectral bias reduction of multi-scale neural networks for regression problems
Bo Wang, Heng Yuan, Lizuo Liu, Wenzhong Zhang, Wei Cai
Neural Networks, volume 185, Article 107179. Published 2025-05-01 (Epub 2025-01-21).
DOI: 10.1016/j.neunet.2025.107179
URL: https://www.sciencedirect.com/science/article/pii/S0893608025000589
Abstract
In this paper, we derive diffusion equation models in the spectral domain to study the evolution of the training error of two-layer multiscale deep neural networks (MscaleDNN) (Cai and Xu, 2019; Liu et al., 2020), which are designed to reduce the spectral bias of fully connected deep neural networks when approximating oscillatory functions. The diffusion models are obtained from the spectral form of the MscaleDNN error equation, which is derived using a neural tangent kernel approach for gradient descent training with a sine activation function, assuming a vanishing learning rate and infinite network width and domain size. The diffusion coefficients are shown to have larger supports when more scales are used in the MscaleDNN; the proposed frequency-domain diffusion models thus explain the MscaleDNN's ability to reduce spectral bias. The diffusion model in the Fourier-spectral domain makes clear how the training error decays at different Fourier frequencies. Numerical results from the diffusion models for two-layer MscaleDNN training match the error evolution of actual gradient descent training with a reasonably large network width, which validates the diffusion models. The numerical results for MscaleDNN also show error decay over a wide frequency range, confirming the advantage of using MscaleDNN to approximate functions with a wide range of frequencies.
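The setting described in the abstract (a two-layer network with sine activation, several input scales, and full-batch gradient descent) can be illustrated with a small NumPy sketch. This is not the authors' code: the parametrization f(x) = m^{-1/2} Σ_k c_k sin(w_k a_k x + b_k), the round-robin assignment of neurons to scales, the initialization, the target function, and all names (`init_params`, `forward`, `gd_step`, `train`) are assumptions made for illustration. The finite width and learning rate only loosely mimic the infinite-width, vanishing-learning-rate regime analyzed in the paper.

```python
import numpy as np

def init_params(width, scales, rng):
    """Hypothetical init: hidden neurons are assigned to scales round-robin."""
    a = np.asarray(scales, dtype=float)[np.arange(width) % len(scales)]
    return {
        "a": a,                                  # fixed scale per neuron (not trained)
        "w": rng.standard_normal(width),         # input weights
        "b": rng.uniform(-np.pi, np.pi, width),  # biases
        "c": rng.standard_normal(width),         # output weights
    }

def forward(p, x):
    """Return f(x) = m^{-1/2} * sum_k c_k * sin(w_k * a_k * x + b_k) and pre-activations."""
    pre = np.outer(x, p["a"] * p["w"]) + p["b"]   # shape (n, m)
    return np.sin(pre) @ p["c"] / np.sqrt(p["c"].size), pre

def gd_step(p, x, y, lr):
    """One full-batch gradient descent step on 0.5 * mean((f - y)^2)."""
    f, pre = forward(p, x)
    r = (f - y) / (x.size * np.sqrt(p["c"].size))  # residual with 1/(n*sqrt(m)) folded in
    cos_c = np.cos(pre) * p["c"]                    # c_k * cos(pre_k), shape (n, m)
    grad_c = np.sin(pre).T @ r
    grad_b = cos_c.T @ r
    grad_w = (cos_c * x[:, None]).T @ r * p["a"]
    p["c"] -= lr * grad_c
    p["b"] -= lr * grad_b
    p["w"] -= lr * grad_w
    return 0.5 * np.mean((f - y) ** 2)              # loss before the update

def train(scales, steps=300, width=200, lr=0.02, seed=0):
    rng = np.random.default_rng(seed)
    x = np.linspace(-1.0, 1.0, 128)
    y = np.sin(np.pi * x) + 0.5 * np.sin(8 * np.pi * x)  # low plus high frequency
    p = init_params(width, scales, rng)
    losses = [gd_step(p, x, y, lr) for _ in range(steps)]
    residual = forward(p, x)[0] - y
    spectrum = np.abs(np.fft.rfft(residual)) / x.size   # error magnitude per discrete frequency
    return np.array(losses), spectrum

losses_single, spec_single = train(scales=[1.0])
losses_multi, spec_multi = train(scales=[1.0, 2.0, 4.0, 8.0])
print("final loss, one scale:  ", losses_single[-1])
print("final loss, four scales:", losses_multi[-1])
```

Comparing `spec_single` and `spec_multi` gives a rough per-frequency view of the remaining training error. The abstract suggests the multi-scale variant should reduce error over a wider frequency range, but this toy run is only a way to explore that claim, not a reproduction of the paper's experiments.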
About the journal:
Neural Networks is a platform that aims to foster an international community of scholars and practitioners interested in neural networks, deep learning, and other approaches to artificial intelligence and machine learning. The journal invites submissions covering various aspects of neural networks research, from computational neuroscience and cognitive modeling to mathematical analyses and engineering applications. By providing a forum for interdisciplinary discussions between biology and technology, it aims to encourage the development of biologically inspired artificial intelligence.