Strategies for Building Artificial Neural Network Models
R. Mahajan
Heat Transfer: Volume 3, 2000-11-05 (Journal Article)
DOI: 10.1115/imece2000-1464
Citations: 0
Abstract
An artificial neural network (ANN) is a massively parallel, dynamic system of processing elements, or neurons, connected in complicated patterns that allow a variety of interactions among the inputs to produce the desired output. It has the ability to learn directly from example data rather than by following programmed rules based on a knowledge base. There is virtually no limit to what an ANN can predict or decipher, so long as it has been trained properly on examples that encompass the entire range of desired predictions. This paper provides an overview of the strategies needed to build accurate ANN models. Following a general introduction to artificial neural networks, the paper describes different techniques to build and train ANN models. Step-by-step procedures are described to demonstrate the mechanics of building neural network models, with particular emphasis on feedforward neural networks using the back-propagation learning algorithm.
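The feedforward-plus-back-propagation mechanics the abstract emphasizes can be sketched in a few lines. Everything concrete below — the XOR toy data, one hidden layer of 4 sigmoid neurons, the learning rate, and the iteration count — is an illustrative assumption, not a configuration from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy training set: XOR, a classic non-linearly-separable example (assumption)
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = np.array([[0.], [1.], [1.], [0.]])

# One hidden layer of 4 neurons; sizes and learning rate are arbitrary choices
W1, b1 = rng.normal(0.0, 1.0, (2, 4)), np.zeros(4)
W2, b2 = rng.normal(0.0, 1.0, (4, 1)), np.zeros(1)
lr = 0.5

losses = []
for _ in range(5000):
    # Forward pass through both layers
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)
    losses.append(float(np.mean((out - y) ** 2)))
    # Backward pass: propagate the error gradient layer by layer
    d_out = (out - y) * out * (1.0 - out)    # error signal at the output layer
    d_h = (d_out @ W2.T) * h * (1.0 - h)     # error signal at the hidden layer
    # Gradient-descent weight updates
    W2 -= lr * h.T @ d_out; b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * X.T @ d_h;   b1 -= lr * d_h.sum(axis=0)
```

Each iteration performs one full-batch forward pass followed by one back-propagation step; the recorded loss sequence shows the training error falling as the weights converge.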
The network structure and pre-processing of data are two significant aspects of ANN model building. The former has a significant influence on the predictive capability of the network [1]. Several studies have addressed the issue of optimal network structure. Kim and May [2] use statistical experimental design to determine an optimal network for a specific application. Bhat and McAvoy [3] propose a stripping algorithm, starting with a large network and then reducing the network's complexity by removing unnecessary weights and nodes. This 'complex-to-simple' procedure requires heavy and tedious computation. Villiers and Bernard [4] conclude that although there is no significant difference between the optimal performance of one- and two-hidden-layer networks, single-hidden-layer networks classify better on average. Marwah et al. [5] advocate a simple-to-complex methodology in which training starts with the simplest ANN structure. The complexity of the structure is incrementally stepped up until an acceptable learning performance is obtained. Preprocessing of data can lead to substantial improvements in the training process. Kown et al. [6] propose a data pre-processing algorithm for a highly skewed data set. Marwah et al. [5] propose two different strategies for dealing with the data. For applications with a significant amount of historical data, a 'smart select' methodology is proposed that ensures an equally weighted distribution of the data over the range of the input parameters. For applications where data are scarce or experiments are expensive to perform, a statistical design-of-experiments approach is suggested. In either case, it is shown that dividing the data into training, testing, and validation sets ensures an accurate ANN model with excellent predictive capabilities.
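One way to read the 'smart select' idea — rebalancing skewed historical data so every part of the input range is equally represented — is as a binned resampling step. The sketch below is an assumption about how such a step might look (the paper does not give the algorithm); the beta-distributed data, bin count, and per-bin quota are all illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

# Skewed 'historical' data: observations crowd the low end of the input range
x = rng.beta(0.5, 3.0, size=5000)

def smart_select(x, n_bins=10, per_bin=50):
    """Pick indices so each bin of the input range contributes
    at most the same number of points (hypothetical helper)."""
    edges = np.linspace(x.min(), x.max(), n_bins + 1)
    edges[-1] += 1e-12                      # make the last bin right-inclusive
    picked = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        idx = np.where((x >= lo) & (x < hi))[0]
        if idx.size:
            picked.extend(rng.choice(idx, min(per_bin, idx.size), replace=False))
    return np.asarray(picked)

sel = smart_select(x)
# x[sel] is much closer to uniform over the input range than x itself,
# so no region of the parameter space dominates training
```

After such a selection, the retained data would still be divided into training, testing, and validation sets, as the abstract recommends.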
The paper also describes the recently developed concepts of physical-neural network models and model transfer techniques. In the former, an ANN model is built on data generated by a 'first-principles' analytical or numerical model of the process under consideration. It is shown that such a model, termed a physical-neural network model, has the accuracy of the first-principles model yet is orders of magnitude faster to execute. Because such a model inherits all the approximations generally inherent in physical models of many complex processes, model transfer techniques have been developed [6] that allow economical development of accurate process-equipment models. Examples from thermally based materials processing are described to illustrate the application of the basic concepts involved.
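The physical-neural network idea — fit an ANN to samples drawn from a first-principles model, then use the cheap ANN in place of the expensive solver — can be sketched as below. The closed-form `first_principles_model` stands in for a real analytical or numerical process model, and scikit-learn's `MLPRegressor` stands in for the paper's feedforward network; both are assumptions for illustration only:

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

def first_principles_model(x):
    """Stand-in for an expensive analytical/numerical process model;
    the function itself is an arbitrary smooth example (assumption)."""
    return np.exp(-2.0 * x) * np.sin(3.0 * np.pi * x)

rng = np.random.default_rng(0)

# Generate training data by sampling the physical model over its input range
X = rng.uniform(0.0, 1.0, (400, 1))
y = first_principles_model(X[:, 0])

# Train the ANN on the physically generated data
surrogate = MLPRegressor(hidden_layer_sizes=(20, 20), solver="lbfgs",
                         max_iter=5000, random_state=0)
surrogate.fit(X, y)

# The surrogate now reproduces the physical model's input-output map
# at a fraction of the cost of re-running a real numerical solver
X_new = np.linspace(0.0, 1.0, 50).reshape(-1, 1)
y_fast = surrogate.predict(X_new)
```

The speed advantage the abstract cites comes from replacing each solver run with a single forward pass through the trained network; the surrogate inherits whatever approximations the physical model makes, which is what motivates the model transfer techniques.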