J. Bilski, Bartosz Kowalczyk, Marek Kisiel-Dorohinicki, A. Siwocha, J. Zurada
Title: Towards a Very Fast Feedforward Multilayer Neural Networks Training Algorithm
Journal: Journal of Artificial Intelligence and Soft Computing Research
Volume: 12 1, pages 181 - 195
Publication date: 2022-07-01
Publication type: Journal Article
DOI: 10.2478/jaiscr-2022-0012 (https://doi.org/10.2478/jaiscr-2022-0012)
Impact factor: 3.3
JCR: Q2, COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE
Region: 3 (Computer Science)
Open access: no
Source platform: Semanticscholar
Citations: 1
Abstract
This paper presents a novel fast algorithm for training feedforward neural networks. It is based on the Recursive Least Squares (RLS) method commonly used for designing adaptive filters. It also uses two linear-algebra techniques, the orthogonal transformation known as Givens rotations (GR) and the QR decomposition, which together form the GQR procedure (symbolically, GR + QR = GQR) for solving the normal equations in the weight update process. The paper presents a novel approach to the GQR algorithm. The main idea is to reduce the computational cost of a single rotation by eliminating the square root calculation and reducing the number of multiplications. The proposed modification is based on the scaled version of Givens rotations and is denoted SGQR. It is expected to significantly reduce training time compared with the classic GQR algorithm. The paper begins with an introduction and a description of the classic Givens rotation. The scaled rotation and its use in the QR decomposition are then discussed. The main section presents the neural network training algorithm, which uses scaled Givens rotations and QR decomposition in the weight update process. Next, the experimental results for the proposed algorithm are presented and discussed. The experiments use several benchmarks combined with neural networks of various topologies. The results show that the proposed algorithm outperforms several other commonly used methods, including the well-known Adam optimizer.
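To make the idea of a square-root-free rotation concrete, the following is a minimal sketch rather than the paper's SGQR algorithm. It assumes Gentleman's classic square-root-free formulation, in which the triangular factor is stored as a diagonal `d` and a unit upper-triangular `Rbar`, and it solves a plain linear least-squares problem instead of a full neural network weight update. All function names (`givens_lstsq`, `sqrt_free_givens_lstsq`, `back_substitute`) and the toy data are hypothetical and chosen only for illustration.

```python
import math


def back_substitute(R, z):
    """Solve the upper-triangular system R @ beta = z."""
    n = len(z)
    beta = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(R[i][k] * beta[k] for k in range(i + 1, n))
        beta[i] = (z[i] - tail) / R[i][i]
    return beta


def givens_lstsq(rows, targets):
    """Row-by-row least squares with classic Givens rotations."""
    n = len(rows[0])
    R = [[0.0] * n for _ in range(n)]  # upper-triangular factor
    z = [0.0] * n                      # rotated right-hand side
    for row, y in zip(rows, targets):
        x = list(row)
        for i in range(n):
            if x[i] == 0.0:
                continue
            r = math.hypot(R[i][i], x[i])  # one square root per rotation
            c, s = R[i][i] / r, x[i] / r
            for k in range(i, n):
                R[i][k], x[k] = c * R[i][k] + s * x[k], c * x[k] - s * R[i][k]
            z[i], y = c * z[i] + s * y, c * y - s * z[i]
    return back_substitute(R, z)


def sqrt_free_givens_lstsq(rows, targets):
    """Same problem with square-root-free (scaled) rotations.

    Assumption: Gentleman's formulation, R = sqrt(D) @ Rbar, where Rbar has
    a unit diagonal. This is not claimed to be the paper's SGQR procedure.
    """
    n = len(rows[0])
    d = [0.0] * n                                            # diagonal scale factors
    Rbar = [[float(i == k) for k in range(n)] for i in range(n)]
    theta = [0.0] * n                                        # scaled right-hand side
    for row, y in zip(rows, targets):
        x = list(row)
        w = 1.0                                              # weight of the incoming row
        for i in range(n):
            if w == 0.0:
                break                                        # row fully absorbed
            xi = x[i]
            if xi == 0.0:
                continue
            d_new = d[i] + w * xi * xi                       # no square root needed
            c_bar, s_bar = d[i] / d_new, w * xi / d_new
            w *= c_bar
            d[i] = d_new
            for k in range(i + 1, n):
                xk = x[k]
                x[k] = xk - xi * Rbar[i][k]
                Rbar[i][k] = c_bar * Rbar[i][k] + s_bar * xk
            y, theta[i] = y - xi * theta[i], c_bar * theta[i] + s_bar * y
    return back_substitute(Rbar, theta)


# Toy data generated exactly from y = 2*a - 3*b + 1 (last column is the bias input).
rows = [[1.0, 0.0, 1.0], [0.0, 1.0, 1.0], [1.0, 1.0, 1.0],
        [2.0, 1.0, 1.0], [1.0, 3.0, 1.0]]
targets = [2.0 * a - 3.0 * b + 1.0 * c for a, b, c in rows]

beta_classic = givens_lstsq(rows, targets)
beta_scaled = sqrt_free_givens_lstsq(rows, targets)
print(beta_classic)  # approximately [2, -3, 1]
print(beta_scaled)   # approximately [2, -3, 1]
```

Both routines process one data row at a time, which mirrors the recursive flavor of RLS. The scaled variant replaces the `math.hypot` call and the full row rotation with a few multiply-add updates, which is the kind of per-rotation saving the abstract describes.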
About the journal:
Journal of Artificial Intelligence and Soft Computing Research (also available at Sciendo (De Gruyter)) is a dynamically developing international journal focused on the latest scientific results and methods in traditional artificial intelligence and soft computing. Our goal is to bring together scientists representing both approaches and various research communities.