{"title":"受限玻尔兹曼机上的无数据集权重初始化","authors":"Muneki Yasuda, Ryosuke Maeno, Chako Takahashi","doi":"arxiv-2409.07708","DOIUrl":null,"url":null,"abstract":"In feed-forward neural networks, dataset-free weight-initialization method\nsuch as LeCun, Xavier (or Glorot), and He initializations have been developed.\nThese methods randomly determine the initial values of weight parameters based\non specific distributions (e.g., Gaussian or uniform distributions) without\nusing training datasets. To the best of the authors' knowledge, such a\ndataset-free weight-initialization method is yet to be developed for restricted\nBoltzmann machines (RBMs), which are probabilistic neural networks consisting\nof two layers, In this study, we derive a dataset-free weight-initialization\nmethod for Bernoulli--Bernoulli RBMs based on a statistical mechanical\nanalysis. In the proposed weight-initialization method, the weight parameters\nare drawn from a Gaussian distribution with zero mean. The standard deviation\nof the Gaussian distribution is optimized based on our hypothesis which is that\na standard deviation providing a larger layer correlation (LC) between the two\nlayers improves the learning efficiency. The expression of the LC is derived\nbased on a statistical mechanical analysis. The optimal value of the standard\ndeviation corresponds to the maximum point of the LC. The proposed\nweight-initialization method is identical to Xavier initialization in a\nspecific case (i.e., in the case the sizes of the two layers are the same, the\nrandom variables of the layers are $\\{-1,1\\}$-binary, and all bias parameters\nare zero).","PeriodicalId":501340,"journal":{"name":"arXiv - STAT - Machine Learning","volume":"6 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Dataset-Free Weight-Initialization on Restricted Boltzmann Machine\",\"authors\":\"Muneki Yasuda, Ryosuke Maeno, Chako Takahashi\",\"doi\":\"arxiv-2409.07708\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"In feed-forward neural networks, dataset-free weight-initialization method\\nsuch as LeCun, Xavier (or Glorot), and He initializations have been developed.\\nThese methods randomly determine the initial values of weight parameters based\\non specific distributions (e.g., Gaussian or uniform distributions) without\\nusing training datasets. To the best of the authors' knowledge, such a\\ndataset-free weight-initialization method is yet to be developed for restricted\\nBoltzmann machines (RBMs), which are probabilistic neural networks consisting\\nof two layers, In this study, we derive a dataset-free weight-initialization\\nmethod for Bernoulli--Bernoulli RBMs based on a statistical mechanical\\nanalysis. In the proposed weight-initialization method, the weight parameters\\nare drawn from a Gaussian distribution with zero mean. The standard deviation\\nof the Gaussian distribution is optimized based on our hypothesis which is that\\na standard deviation providing a larger layer correlation (LC) between the two\\nlayers improves the learning efficiency. The expression of the LC is derived\\nbased on a statistical mechanical analysis. The optimal value of the standard\\ndeviation corresponds to the maximum point of the LC. 
The proposed\\nweight-initialization method is identical to Xavier initialization in a\\nspecific case (i.e., in the case the sizes of the two layers are the same, the\\nrandom variables of the layers are $\\\\{-1,1\\\\}$-binary, and all bias parameters\\nare zero).\",\"PeriodicalId\":501340,\"journal\":{\"name\":\"arXiv - STAT - Machine Learning\",\"volume\":\"6 1\",\"pages\":\"\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2024-09-12\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"arXiv - STAT - Machine Learning\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/arxiv-2409.07708\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"arXiv - STAT - Machine Learning","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/arxiv-2409.07708","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Dataset-Free Weight-Initialization on Restricted Boltzmann Machine
In feed-forward neural networks, dataset-free weight-initialization methods such as LeCun, Xavier (or Glorot), and He initialization have been developed. These methods randomly draw the initial values of the weight parameters from specific distributions (e.g., Gaussian or uniform distributions) without using any training dataset.
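
For concreteness, a minimal NumPy sketch of one such dataset-free scheme (Xavier/Glorot-normal initialization; the layer sizes are illustrative, and the He and LeCun variants differ only in the variance formula):

    import numpy as np

    def xavier_normal(n_in, n_out, rng=None):
        """Glorot/Xavier-normal initialization: W ~ N(0, 2 / (n_in + n_out))."""
        rng = np.random.default_rng() if rng is None else rng
        std = np.sqrt(2.0 / (n_in + n_out))
        return rng.normal(loc=0.0, scale=std, size=(n_in, n_out))

    # He initialization would use std = sqrt(2 / n_in); LeCun uses sqrt(1 / n_in).
    W = xavier_normal(784, 256)  # illustrative layer sizes
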
To the best of the authors' knowledge, no such dataset-free weight-initialization method has yet been developed for restricted Boltzmann machines (RBMs), which are probabilistic neural networks consisting of two layers. In this study, we derive a dataset-free weight-initialization method for Bernoulli--Bernoulli RBMs based on a statistical mechanical analysis. In the proposed method, the weight parameters are drawn from a Gaussian distribution with zero mean. The standard deviation of the Gaussian distribution is optimized based on our hypothesis that a standard deviation yielding a larger layer correlation (LC) between the two layers improves the learning efficiency. The expression of the LC is derived from a statistical mechanical analysis, and the optimal standard deviation is the one at which the LC attains its maximum.
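
The abstract does not reproduce the closed-form LC expression, so the following sketch only illustrates the selection rule it describes: layer_correlation is a hypothetical placeholder standing in for the paper's derived formula, and sigma_star is found by a simple one-dimensional grid search over candidate standard deviations (the grid bounds here are arbitrary):

    import numpy as np

    def layer_correlation(sigma, n_v, n_h):
        """Placeholder for the paper's LC expression (derived via a statistical
        mechanical analysis; not given in the abstract)."""
        raise NotImplementedError

    def proposed_init(n_v, n_h, sigma_grid=None, rng=None):
        """Draw W ~ N(0, sigma_star^2), where sigma_star maximizes the LC."""
        rng = np.random.default_rng() if rng is None else rng
        sigma_grid = np.linspace(0.01, 2.0, 200) if sigma_grid is None else sigma_grid
        lc_values = [layer_correlation(s, n_v, n_h) for s in sigma_grid]
        sigma_star = sigma_grid[int(np.argmax(lc_values))]
        return rng.normal(0.0, sigma_star, size=(n_v, n_h))
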
The proposed weight-initialization method is identical to Xavier initialization in a specific case, namely, when the two layers have the same size, the random variables of the layers are $\{-1,1\}$-binary, and all bias parameters are zero.
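
As a quick numeric illustration of this special case: for two layers of equal size $n$, the Glorot-normal standard deviation $\sqrt{2/(n_{\mathrm{in}}+n_{\mathrm{out}})}$ reduces to $1/\sqrt{n}$, which is the value the LC-maximizing procedure should then recover.

    import math

    n = 500  # illustrative common size of the visible and hidden layers
    xavier_std = math.sqrt(2.0 / (n + n))  # = 1 / sqrt(n), about 0.0447 here
    print(xavier_std)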