NEAR: A Training-Free Pre-Estimator of Machine Learning Model Performance
Raphael T. Husistein, Markus Reiher, Marco Eckhoff
arXiv - PHYS - Data Analysis, Statistics and Probability, published 2024-08-16, arXiv:2408.08776
Abstract
Artificial neural networks have been shown to be state-of-the-art machine learning models in a wide variety of applications, including natural language processing and image recognition. However, building a performant neural network is a laborious task and requires substantial computing power. Neural Architecture Search (NAS) addresses this issue by automatically selecting the optimal network from a set of candidate architectures. While many NAS methods still require training of (some) neural networks, zero-cost proxies promise to identify the optimal network without training. In this work, we propose the zero-cost proxy Network Expressivity by Activation Rank (NEAR). It is based on the effective rank of the pre- and post-activation matrices, i.e., the values of a neural network layer before and after applying its activation function. We demonstrate state-of-the-art correlation between this network score and model accuracy on NAS-Bench-101 and NATS-Bench-SSS/TSS. In addition, we present a simple approach to estimate the optimal layer sizes in multi-layer perceptrons. Furthermore, we show that this score can be utilized to select hyperparameters such as the activation function and the neural network weight initialization scheme.
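
To make the idea concrete, below is a minimal sketch of a NEAR-style score, not the authors' reference implementation. It assumes the Roy-Vetterli definition of effective rank (exponential of the Shannon entropy of the normalized singular values), sums the effective ranks of the pre- and post-activation matrices of each linear layer, and evaluates an untrained network on a single random mini-batch. The names `near_score` and `effective_rank`, the use of a ReLU activation, and the restriction to `nn.Linear` layers are illustrative assumptions.

```python
# Hypothetical NEAR-style zero-cost proxy: sum of effective ranks of the
# pre- and post-activation matrices of each linear layer, computed on one
# mini-batch without any training. Sketch only, not the paper's code.
import torch
import torch.nn as nn


def effective_rank(matrix: torch.Tensor) -> float:
    """Effective rank (Roy & Vetterli): exp of the Shannon entropy of the
    normalized singular-value distribution."""
    s = torch.linalg.svdvals(matrix.double())
    s = s[s > 0]
    p = s / s.sum()
    return float(torch.exp(-(p * torch.log(p)).sum()))


def near_score(model: nn.Module, x: torch.Tensor) -> float:
    """Accumulate effective ranks of pre- and post-activation matrices via
    forward hooks during a single untrained forward pass."""
    score = 0.0
    handles = []

    def hook(module, inputs, output):
        nonlocal score
        pre = output.detach().flatten(1)   # pre-activation matrix (batch x features)
        post = torch.relu(pre)             # assumed ReLU activation for this sketch
        score += effective_rank(pre) + effective_rank(post)

    for m in model.modules():
        if isinstance(m, nn.Linear):
            handles.append(m.register_forward_hook(hook))
    with torch.no_grad():
        model(x)
    for h in handles:
        h.remove()
    return score


if __name__ == "__main__":
    torch.manual_seed(0)
    batch = torch.randn(64, 16)            # one random mini-batch
    candidate = nn.Sequential(nn.Linear(16, 32), nn.ReLU(),
                              nn.Linear(32, 32), nn.ReLU(),
                              nn.Linear(32, 10))
    print(f"NEAR-style score: {near_score(candidate, batch):.2f}")
```

In a NAS setting, such a score would be computed for every candidate architecture and the candidates ranked by it, so that no candidate has to be trained before selection; the same ranking idea underlies the paper's use of the score for choosing layer sizes, activation functions, and weight initialization schemes.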