Monomial Matrix Group Equivariant Neural Functional Networks
Hoang V. Tran, Thieu N. Vo, Tho H. Tran, An T. Nguyen, Tan Minh Nguyen
arXiv:2409.11697
Abstract
Neural functional networks (NFNs) have recently gained significant attention due to their diverse applications, ranging from predicting network generalization and network editing to classifying implicit neural representations. Previous NFN designs often depend on permutation symmetries in neural networks' weights, which traditionally arise from the unordered arrangement of neurons in hidden layers. However, these designs do not take into account the weight-scaling symmetries of $\operatorname{ReLU}$ networks or the weight sign-flipping symmetries of $\operatorname{sin}$ and $\operatorname{tanh}$ networks. In this paper, we extend the study of the group action on the network weights from the group of permutation matrices to the group of monomial matrices by incorporating scaling/sign-flipping symmetries. In particular, we encode these scaling/sign-flipping symmetries by designing corresponding equivariant and invariant layers. We name our new family of NFNs the Monomial Matrix Group Equivariant Neural Functional Networks (Monomial-NFN). Because of the expanded symmetries, Monomial-NFN has far fewer independent trainable parameters than baseline NFNs in the literature, which improves the model's efficiency. Moreover, for fully connected and convolutional neural networks, we theoretically prove that every group leaving these networks invariant while acting on their weight spaces is a subgroup of the monomial matrix group. We provide empirical evidence demonstrating the advantages of our model over existing baselines, achieving competitive performance and efficiency.
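The weight-space symmetries named in the abstract can be checked directly on a toy network. The following NumPy sketch is a minimal illustration (not the authors' code, and the two-layer network, shapes, and variable names are assumptions for demonstration): scaling a hidden ReLU neuron's incoming weights by a positive factor and its outgoing weights by the reciprocal leaves the network function unchanged, and flipping the sign of a tanh neuron's incoming and outgoing weights does the same. These diagonal actions, composed with permutations of hidden neurons, are exactly the monomial matrix group actions the paper builds its equivariant layers around.

```python
# Minimal numerical sketch of the ReLU scaling and tanh sign-flip symmetries
# on a toy two-layer MLP (illustrative only; not from the paper).
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 3))                      # batch of 5 inputs, 3 features
W1, b1 = rng.normal(size=(3, 4)), rng.normal(size=4)
W2, b2 = rng.normal(size=(4, 2)), rng.normal(size=2)

def mlp(x, W1, b1, W2, b2, act):
    """Two-layer network: act(x W1 + b1) W2 + b2."""
    return act(x @ W1 + b1) @ W2 + b2

# ReLU scaling symmetry: scale each hidden neuron's incoming weights/bias by a
# positive d_i and its outgoing weights by 1/d_i. Since ReLU(d*z) = d*ReLU(z)
# for d > 0, the network function is unchanged. The vector d corresponds to a
# positive diagonal (monomial) matrix acting on the hidden layer.
relu = lambda z: np.maximum(z, 0)
d = rng.uniform(0.5, 2.0, size=4)
out_ref = mlp(x, W1, b1, W2, b2, relu)
out_scaled = mlp(x, W1 * d, b1 * d, W2 / d[:, None], b2, relu)
assert np.allclose(out_ref, out_scaled)

# tanh sign-flip symmetry: flip the sign of a hidden neuron's incoming and
# outgoing weights. Since tanh(-z) = -tanh(z), the function is again unchanged.
# The sign pattern s in {-1, +1}^4 is another diagonal monomial matrix.
s = rng.choice([-1.0, 1.0], size=4)
out_ref = mlp(x, W1, b1, W2, b2, np.tanh)
out_flipped = mlp(x, W1 * s, b1 * s, W2 * s[:, None], b2, np.tanh)
assert np.allclose(out_ref, out_flipped)
```

Composing these diagonal scalings or sign flips with permutations of the hidden neurons yields monomial matrices, which is the group action on weight spaces that Monomial-NFN layers are designed to be equivariant (or invariant) to.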