R. Mahajan, “Strategies for Building Artificial Neural Network Models,” Heat Transfer: Volume 3, ASME, November 5, 2000. DOI: 10.1115/imece2000-1464
Citations: 0
Strategies for Building Artificial Neural Network Models
An artificial neural network (ANN) is a massively parallel, dynamic system of processing elements (neurons) connected in complicated patterns that allow a variety of interactions among the inputs to produce the desired output. It has the ability to learn directly from example data rather than by following programmed rules drawn from a knowledge base. There is virtually no limit to what an ANN can predict or decipher, so long as it has been trained properly on examples that encompass the entire range of desired predictions. This paper provides an overview of the strategies needed to build accurate ANN models. Following a general introduction to artificial neural networks, the paper will describe different techniques to build and train ANN models. Step-by-step procedures will be described to demonstrate the mechanics of building neural network models, with particular emphasis on feedforward neural networks trained with the back-propagation learning algorithm.
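The mechanics described above can be sketched in a few lines. The following minimal example (not from the paper; the XOR toy task, network size, learning rate, and iteration count are all illustrative assumptions) trains a one-hidden-layer feedforward network with back-propagation:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy training set: XOR, a classic non-linearly-separable example.
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = np.array([[0.], [1.], [1.], [0.]])

n_hidden, lr = 4, 1.0                      # illustrative choices
W1 = rng.normal(0.0, 1.0, (2, n_hidden))   # input-to-hidden weights
b1 = np.zeros(n_hidden)
W2 = rng.normal(0.0, 1.0, (n_hidden, 1))   # hidden-to-output weights
b2 = np.zeros(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(X):
    h = sigmoid(X @ W1 + b1)
    return h, sigmoid(h @ W2 + b2)

loss_before = np.mean((forward(X)[1] - y) ** 2)

for _ in range(5000):
    h, out = forward(X)
    # Back-propagation: output-layer error, then error pushed back
    # through W2 to the hidden layer (chain rule on the sigmoid).
    d_out = (out - y) * out * (1.0 - out)
    d_h = (d_out @ W2.T) * h * (1.0 - h)
    # Gradient-descent weight updates.
    W2 -= lr * h.T @ d_out
    b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * X.T @ d_h
    b1 -= lr * d_h.sum(axis=0)

loss_after = np.mean((forward(X)[1] - y) ** 2)
```

Each pass computes the squared-error gradient layer by layer, from the output back to the input, which is exactly the ‘back-propagation’ of the error signal.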
The network structure and the pre-processing of data are two significant aspects of ANN model building. The former has a significant influence on the predictive capability of the network [1]. Several studies have addressed the issue of optimal network structure. Kim and May [2] use statistical experimental design to determine an optimal network for a specific application. Bhat and McAvoy [3] propose a stripping algorithm that starts with a large network and then reduces its complexity by removing unnecessary weights/nodes; this ‘complex-to-simple’ procedure requires heavy and tedious computation. Villiers and Bernard [4] conclude that although there is no significant difference between the optimal performance of one- and two-hidden-layer networks, single-hidden-layer networks perform better classification on average. Marwah et al. [5] advocate a simple-to-complex methodology in which training starts with the simplest ANN structure, and the complexity of the structure is incrementally stepped up until an acceptable learning performance is obtained. Pre-processing of data can lead to substantial improvements in the training process. Kown et al. [6] propose a data pre-processing algorithm for highly skewed data sets. Marwah et al. [5] propose two different strategies for dealing with the data. For applications with a significant amount of historical data, a ‘smart select’ methodology is proposed that ensures an equally weighted distribution of the data over the range of the input parameters. For applications where data are scarce or experiments are expensive to perform, a statistical design-of-experiments approach is suggested. In either case, it is shown that dividing the data into training, testing, and validation sets ensures an accurate ANN model with excellent predictive capabilities.
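The simple-to-complex strategy [5] and the training/validation split can be sketched together. In this illustrative example (not from the paper), the hidden layer uses fixed random tanh features with a least-squares output layer as a shorthand for a back-propagation-trained net, and the toy data, stopping threshold, and size cap are all assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical process data: a smooth 1-D response with noise.
x = np.linspace(-1.0, 1.0, 200).reshape(-1, 1)
y = np.sin(3.0 * x) + 0.05 * rng.normal(size=x.shape)

# Split into training / validation (a test set would be held out the same way).
idx = rng.permutation(len(x))
tr, va = idx[:150], idx[150:]

def fit_and_score(n_hidden):
    """Fit a network of the given size; return its validation error."""
    W = rng.normal(0.0, 2.0, (1, n_hidden))   # fixed random hidden weights
    b = rng.normal(0.0, 1.0, n_hidden)
    H = np.tanh(x @ W + b)                    # hidden-layer activations
    w_out, *_ = np.linalg.lstsq(H[tr], y[tr], rcond=None)
    return np.mean((H[va] @ w_out - y[va]) ** 2)

# Simple-to-complex: start with the smallest structure and grow it
# until validation performance is acceptable (or a size cap is hit).
n_hidden, history = 1, []
while True:
    err = fit_and_score(n_hidden)
    history.append((n_hidden, err))
    if err < 0.01 or n_hidden >= 32:
        break
    n_hidden += 1                             # step up the complexity
```

The validation set, untouched during fitting, is what decides when the structure is complex enough, which is the role the training/testing/validation division plays in the abstract.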
The paper also describes the recently developed concepts of physical-neural network models and model transfer techniques. In the former, an ANN model is built on data generated by a ‘first-principles’ analytical or numerical model of the process under consideration. It is shown that such a model, termed a physical-neural network model, has the accuracy of the first-principles model yet is orders of magnitude faster to execute. In recognition of the fact that such a model carries all the approximations generally inherent in physical models of many complex processes, model transfer techniques have been developed [6] that allow economical development of accurate process equipment models. Examples from thermally-based materials processing will be described to illustrate the application of the basic concepts involved.