Pub Date: 2011-12-01. Epub Date: 2011-11-23. DOI: 10.1109/TNN.2011.2176349
Luis Diago, Tetsuko Kitaoka, Ichiro Hagiwara, Toshiki Kambayashi
Artificial neural networks are nonlinear techniques that typically provide some of the most accurate predictive models of how faces are perceived in terms of the social impressions they make on people. However, they are often unsuitable for many practical application domains because of their lack of transparency and comprehensibility. This paper proposes a new neuro-fuzzy method to investigate the characteristics of the facial images perceived as Iyashi by 114 subjects. Iyashi is a Japanese word used to describe a peculiar phenomenon that is mentally soothing but is yet to be clearly defined. To gain clear insight into the reasoning of nonlinear prediction models such as holographic neural networks (HNN) in the classification of Iyashi expressions, the interpretability of the proposed fuzzy-quantized HNN (FQHNN) is improved by reducing the number of input parameters, creating membership functions, and extracting fuzzy rules from the responses provided by the subjects about a limited dataset of 20 facial images. The experimental results show that the proposed FQHNN achieves a 2-8% increase in prediction accuracy compared with traditional neuro-fuzzy classifiers, while extracting 35 fuzzy rules that explain what characteristics a facial image should have in order to be classified as an Iyashi stimulus for 87 subjects.
Title: Neuro-fuzzy quantification of personal perceptions of facial images based on a limited data set. Journal: IEEE Transactions on Neural Networks, vol. 22, no. 12, pp. 2422-34.
Pub Date: 2011-12-01. Epub Date: 2011-11-29. DOI: 10.1109/TNN.2011.2174444
Puya Afshar, Martin Brown, Jan Maciejowski, Hong Wang
Reducing energy consumption is a major challenge for "energy-intensive" industries such as papermaking. A commercially viable energy-saving solution is to employ data-based optimization techniques to obtain a set of "optimized" operational settings that satisfy certain performance indices. This poses three difficulties: 1) problems of this type are inherently multicriteria, in the sense that improving one performance index might compromise other important measures; 2) practical systems often exhibit unknown complex dynamics and several interconnections, which make the modeling task difficult; and 3) because the models are acquired from existing historical data, they are valid only locally, and extrapolation carries the risk of increasing process variability. To overcome these difficulties, this paper presents a new decision support system for robust multiobjective optimization of interconnected processes. The plant is first divided into serially connected units to model the process, product quality, energy consumption, and corresponding uncertainty measures. Then a multiobjective gradient descent algorithm is used to solve the problem in line with the user's preference information. Finally, the optimization results are visualized for analysis and decision making. In practice, the validity of the local models must be checked before any further iterations of the optimization algorithm are carried out. The method is implemented in DataExplorer, a MATLAB-based interactive tool supporting a range of data analysis, modeling, and multiobjective optimization techniques. The proposed approach was tested in two U.K.-based commercial paper mills, where the aim was to reduce steam consumption and increase productivity while maintaining product quality by optimizing the vacuum pressures in the forming and press sections. The experimental results demonstrate the effectiveness of the method.
Title: Data-based robust multiobjective optimization of interconnected processes: energy efficiency case study in papermaking. Journal: IEEE Transactions on Neural Networks, vol. 22, no. 12, pp. 2324-38.
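The abstract above mentions a multiobjective gradient descent guided by the user's preference information but gives no formulation. The following is a minimal sketch of one common way to do this, assuming weighted-sum scalarization; the function `weighted_gradient_descent`, the two toy quadratic objectives, and the weights are all hypothetical illustrations, not the paper's algorithm or any part of DataExplorer.

```python
def weighted_gradient_descent(grads, weights, x0, step=0.1, iters=200):
    """Minimize sum_k weights[k] * f_k(x) by plain gradient descent.

    grads   -- list of functions, each returning the gradient of one objective
    weights -- preference weights, one per objective (assumed to sum to 1)
    """
    x = list(x0)
    for _ in range(iters):
        # Combine the per-objective gradients using the preference weights.
        combined = [0.0] * len(x)
        for w, grad in zip(weights, grads):
            g = grad(x)
            for i in range(len(x)):
                combined[i] += w * g[i]
        # Take one descent step on the scalarized objective.
        x = [xi - step * gi for xi, gi in zip(x, combined)]
    return x


# Toy conflicting objectives (hypothetical): f1 = (x - 1)^2 and f2 = (x + 1)^2.
def grad_f1(x):
    return [2.0 * (x[0] - 1.0)]


def grad_f2(x):
    return [2.0 * (x[0] + 1.0)]


# A user who weights f1 three times as heavily as f2.
x_star = weighted_gradient_descent([grad_f1, grad_f2], [0.75, 0.25], [0.0])
```

With these weights the scalarized objective is minimized at x = 0.5, a compromise that sits closer to the minimizer of the more heavily weighted objective. Changing the weights moves the solution along the trade-off between the two objectives.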
Pub Date: 2011-12-01. Epub Date: 2011-11-28. DOI: 10.1109/TNN.2011.2175748
Huaguang Zhang, Jinhai Liu, Dazhong Ma, Zhanshan Wang
A fuzzy min-max neural network based on data core (DCFMN) is proposed for pattern classification. A new membership function for the classifying neurons of DCFMN is defined, which takes into account the noise, the geometric center of the hyperbox, and the data core. Instead of using the contraction process of the FMNN described by Simpson, a kind of overlapped neuron with a new membership function based on the data core is proposed and added to the network to represent the overlapping area of hyperboxes belonging to different classes. Furthermore, algorithms for online learning and classification are presented according to the structure of DCFMN. Because it takes into account the effect of the data core and noise, DCFMN offers strong robustness and high classification accuracy. The performance of DCFMN is evaluated on benchmark datasets and compared with traditional fuzzy neural networks, such as the fuzzy min-max neural network (FMNN), the general FMNN, and the FMNN with compensatory neuron. Finally, the pattern classification of a pipeline is evaluated using DCFMN and other classifiers. All the results indicate that the performance of DCFMN is excellent.
Title: Data-core-based fuzzy min-max neural network for pattern classification. Journal: IEEE Transactions on Neural Networks, vol. 22, no. 12, pp. 2339-52.
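For context, the baseline that this abstract compares against, Simpson's classical FMNN, scores a pattern against a hyperbox with a membership function that equals 1 inside the box and decays with distance outside it. The sketch below implements that classical function only. The DCFMN membership function (with noise, geometric center, and data core) is not given in the abstract and is not reproduced here. The name `hyperbox_membership` and the sample values are illustrative assumptions.

```python
def hyperbox_membership(a, v, w, gamma=4.0):
    """Classical FMNN membership of pattern a in the hyperbox [v, w].

    a     -- input pattern, each coordinate assumed scaled to [0, 1]
    v, w  -- min point and max point of the hyperbox
    gamma -- sensitivity: larger values make membership fall off faster
    """
    n = len(a)
    total = 0.0
    for ai, vi, wi in zip(a, v, w):
        # Penalty for exceeding the max point along this dimension.
        above = max(0.0, 1.0 - max(0.0, gamma * min(1.0, ai - wi)))
        # Penalty for falling below the min point along this dimension.
        below = max(0.0, 1.0 - max(0.0, gamma * min(1.0, vi - ai)))
        total += above + below
    return total / (2 * n)


# A pattern inside the box has full membership; one just outside has less.
inside = hyperbox_membership([0.3, 0.4], [0.2, 0.2], [0.5, 0.5])
outside = hyperbox_membership([0.6, 0.4], [0.2, 0.2], [0.5, 0.5])
```

In this classical scheme, overlaps between hyperboxes of different classes are removed by contraction. The abstract states that DCFMN instead represents such overlaps with dedicated overlapped neurons.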
Pub Date: 2011-12-01. Epub Date: 2011-12-05. DOI: 10.1109/TNN.2011.2176541
Sajad Saeedi, Liam Paull, Michael Trentini, Howard Li
In this paper, a decentralized platform for simultaneous localization and mapping (SLAM) with multiple robots is developed. Each robot performs single-robot view-based SLAM using an extended Kalman filter to fuse data from two encoders and a laser ranger. To extend this approach to multiple-robot SLAM, a novel occupancy grid map fusion algorithm is proposed. Map fusion is achieved through a multistep process that includes image preprocessing, map learning (clustering) using neural networks, relative orientation extraction using norm histogram cross correlation and a Radon transform, relative translation extraction using matching norm vectors, and verification of the results. The proposed map learning method is based on the self-organizing map. In the learning phase, the obstacles of the map are learned by grouping the occupied cells of the map into clusters. The learning is an unsupervised process that can be done on the fly without any need for output training patterns. The clusters represent the spatial form of the map and make further analyses of the map easier and faster. The clusters can also be interpreted as features extracted from the occupancy grid map, so the map fusion problem becomes a feature-matching task. Experiments performed in a real environment with multiple robots demonstrate the effectiveness of the proposed solution.
Title: Neural network-based multiple robot simultaneous localization and mapping. Journal: IEEE Transactions on Neural Networks, vol. 22, no. 12, pp. 2376-87.
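The map learning step described above clusters the occupied cells of an occupancy grid with a self-organizing map. As a rough illustration of that idea only, the sketch below trains a small one-dimensional SOM on occupied-cell coordinates. The function `train_som`, its parameters, the decay schedules, and the toy grid are assumptions made for this example and are not taken from the paper.

```python
import math
import random


def train_som(points, n_nodes=4, epochs=50, lr0=0.5, sigma0=1.5, seed=0):
    """Fit a 1-D chain of prototype nodes to 2-D points (unsupervised)."""
    rng = random.Random(seed)
    # Initialize each node on a randomly chosen data point.
    nodes = [list(rng.choice(points)) for _ in range(n_nodes)]
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                  # learning rate decays linearly
        sigma = max(sigma0 * (1.0 - frac), 0.1)  # neighborhood width shrinks
        for p in rng.sample(points, len(points)):
            # Best-matching unit: the node closest to the current point.
            bmu = min(range(n_nodes), key=lambda k: math.dist(nodes[k], p))
            for k in range(n_nodes):
                # Gaussian neighborhood along the chain index.
                h = math.exp(-((k - bmu) ** 2) / (2.0 * sigma ** 2))
                for d in range(len(p)):
                    nodes[k][d] += lr * h * (p[d] - nodes[k][d])
    return nodes


# Toy occupancy grid (hypothetical): a wall segment and a small square obstacle.
occupied = [(0, c) for c in range(5)] + [(r, c) for r in (8, 9) for c in (8, 9)]
nodes = train_som(occupied)
```

Each trained node can be read as a cluster center summarizing part of an obstacle. Matching such centers between two robots' maps is the kind of feature matching the abstract refers to, although the paper's actual matching steps (norm histograms, Radon transform) are not sketched here.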
Pub Date: 2011-12-01. DOI: 10.1109/TNN.2011.2172628
Huaguang Zhang, Ruizhuo Song, Qinglai Wei, Tieyan Zhang
In this paper, a novel heuristic dynamic programming (HDP) iteration algorithm is proposed to solve the optimal tracking control problem for a class of nonlinear discrete-time systems with time delays. The algorithm comprises state updating, control policy iteration, and performance index iteration. The states are updated along with the control policy so that the optimal states are obtained, and a "backward iteration" is applied to the state updating. To facilitate the implementation of the HDP iteration algorithm, two neural networks are used to approximate the performance index function and to compute the optimal control policy. Finally, two examples demonstrate the effectiveness of the proposed HDP iteration algorithm.
Title: Optimal tracking control for a class of nonlinear discrete-time systems with time delays based on heuristic dynamic programming. Journal: IEEE Transactions on Neural Networks, pp. 1851-62.
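The performance index iteration at the heart of HDP can be illustrated on a much simpler problem than the one treated in the paper. The sketch below assumes a scalar linear system x[t+1] = a*x[t] + b*u[t] with no time delay and stage cost q*x^2 + r*u^2, so that the performance index stays exactly quadratic, V_i(x) = p_i * x^2, and no neural network approximation is needed. The function name and all parameter values are hypothetical.

```python
def hdp_value_iteration(a, b, q, r, iters=100):
    """Value iteration V_{i+1}(x) = min_u [q x^2 + r u^2 + V_i(a x + b u)].

    With V_i(x) = p_i * x^2 the minimization has a closed form, so each
    sweep updates only the scalar p. Starts from V_0 = 0.
    """
    p = 0.0
    for _ in range(iters):
        # Greedy control for the current value estimate: u = -gain * x.
        gain = a * b * p / (r + b * b * p)
        # Performance index update under that greedy control.
        p = q + a * a * p - a * b * p * gain
    # Control gain implied by the final value estimate.
    k = a * b * p / (r + b * b * p)
    return p, k


p, k = hdp_value_iteration(a=1.0, b=1.0, q=1.0, r=1.0)
```

For a = b = q = r = 1 the iteration settles at p = (1 + sqrt(5)) / 2, the positive root of p^2 - p - 1 = 0, with feedback gain k = p / (1 + p). In the paper's setting the value function and policy are instead approximated by two neural networks, and the state sequence is updated as well.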
Pub Date: 2011-12-01. Epub Date: 2011-10-06. DOI: 10.1109/TNN.2011.2169425
Tao Li, Wei Xing Zheng, Chong Lin
By using the fact that the neuron activation functions are sector bounded and nondecreasing, this brief presents a new method, named the delay-slope-dependent method, for stability analysis of a class of recurrent neural networks with time-varying delays. This method includes more information on the slope of neuron activation functions and fewer matrix variables in the constructed Lyapunov-Krasovskii functional. Then some improved delay-dependent stability criteria with less computational burden and conservatism are obtained. Numerical examples are given to illustrate the effectiveness and the benefits of the proposed method.
Title: Delay-slope-dependent stability results of recurrent neural networks. Journal: IEEE Transactions on Neural Networks, vol. 22, no. 12, pp. 2138-43.
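The abstract relies on the activation functions being sector bounded and nondecreasing without stating the condition. A commonly used form of this assumption is written out below as an illustration. The symbols f_i for the i-th activation function and k_i for its slope bound are assumed notation, and the exact assumption used in the paper may differ.

```latex
f_i(0) = 0, \qquad
0 \le \frac{f_i(a) - f_i(b)}{a - b} \le k_i \quad (a \ne b)
\;\Longrightarrow\;
f_i(s)\,\bigl(f_i(s) - k_i s\bigr) \le 0 \quad \text{for all } s .
```

Setting b = 0 gives 0 <= f_i(s)/s <= k_i for s != 0, and multiplying through by s^2 yields the quadratic inequality on the right. Inequalities of this kind are what allow slope information to enter a Lyapunov-Krasovskii functional.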
Pub Date: 2011-12-01. Epub Date: 2011-09-26. DOI: 10.1109/TNN.2011.2167630
Xiaohui Lu, Hong Chen, Ping Wang, Bingzhao Gao
In this paper, a data-driven predictive controller is designed for the start-up process of vehicles with automated manual transmissions (AMTs). It is obtained directly from the input-output data of a driveline simulation model built with the commercial software AMESim. To obtain offset-free tracking of the reference input, the predictor equation is formulated with incremental inputs and outputs. Because of the physical characteristics of the system, the input and output constraints are considered explicitly in the problem formulation. The conflicting requirements of lower friction losses and less driveline shock are both included in the objective function. The designed controller is tested under nominal and perturbed conditions. The simulation results show that, during the start-up process, the AMT clutch with the proposed controller works very well, and the process meets the control objectives: fast clutch lockup time, small friction losses, and the preservation of driver comfort, i.e., smooth acceleration of the vehicle. At the same time, the closed-loop system is able to reject uncertainties such as those in the vehicle mass and road grade.
Title: Design of a data-driven predictive controller for start-up process of AMT vehicles. Journal: IEEE Transactions on Neural Networks, vol. 22, no. 12, pp. 2201-12.
Pub Date: 2011-12-01. Epub Date: 2011-10-28. DOI: 10.1109/TNN.2011.2169809
Rafał Długosz, Marta Kolasa, Witold Pedrycz, Michał Szulc
We present a new programmable neighborhood mechanism for hardware-implemented Kohonen self-organizing maps (SOMs) with three different map topologies realized on a single chip. The proposed circuit is a fully parallel and asynchronous architecture. The mechanism is very fast. In a medium-sized map with several hundred neurons implemented in 0.18 μm complementary metal-oxide-semiconductor (CMOS) technology, all neurons start adapting their weights after no more than 11 ns. The adaptation is then carried out in parallel. This is a clear advantage over commonly used software-realized SOMs. The circuit is robust against variations in process, supply voltage, and ambient temperature. Owing to its simple structure, it features a low energy consumption of a few pJ per neuron per learning pattern. In this paper, we discuss different aspects of hardware realization, such as a suitable selection of the map topology and the initial neighborhood range, as the optimization of these parameters is essential from the point of view of circuit complexity. For the optimal values of these parameters, the chip area and the power dissipation can be reduced by as much as 60% and 80%, respectively, without affecting the quality of learning.
Title: Parallel programmable asynchronous neighborhood mechanism for Kohonen SOM implemented in CMOS technology. Journal: IEEE Transactions on Neural Networks, pp. 2091-104.
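To make the roles of map topology and neighborhood range concrete, the sketch below lists which neurons fall inside the neighborhood of a winning neuron on a rectangular grid. The abstract does not name the three topologies realized on the chip, so the two shown here (4-neighbor and 8-neighbor rectangular grids) are illustrative assumptions, as are the function name and the labels `rect4` and `rect8`. This sequential software loop only mirrors the selection logic and does not model the parallel, asynchronous circuit.

```python
def neighborhood(winner, rows, cols, radius, topology="rect4"):
    """Return grid positions within `radius` hops of the winning neuron.

    rect4 -- each neuron links to 4 neighbors (hop count = Manhattan distance)
    rect8 -- each neuron links to 8 neighbors (hop count = Chebyshev distance)
    """
    if topology not in ("rect4", "rect8"):
        raise ValueError("unknown topology: " + topology)
    wr, wc = winner
    members = []
    for r in range(rows):
        for c in range(cols):
            dr, dc = abs(r - wr), abs(c - wc)
            hops = dr + dc if topology == "rect4" else max(dr, dc)
            if hops <= radius:
                members.append((r, c))
    return members


# Neurons adapted around a winner at the center of a 5 x 5 map, radius 1.
near4 = neighborhood((2, 2), 5, 5, 1, "rect4")
near8 = neighborhood((2, 2), 5, 5, 1, "rect8")
```

For the same radius, the 8-neighbor grid adapts more neurons per pattern than the 4-neighbor grid (9 versus 5 here). This illustrates why the choice of topology and initial neighborhood range affects how many neurons are adapted per pattern.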
Pub Date: 2011-12-01. Epub Date: 2011-11-04. DOI: 10.1109/TNN.2011.2173502
Keem Siah Yap, Chee Peng Lim, Mau Teng Au
Generalized adaptive resonance theory (GART) is a neural network model that is capable of online learning and is effective in tackling pattern classification tasks. In this paper, we propose an improved GART model (IGART) and demonstrate its applicability to power systems. IGART enhances the dynamics of GART in several respects: the use of the Laplacian likelihood function, a new vigilance function, a new match-tracking mechanism, an ordering algorithm for determining the sequence of training data, and a rule extraction capability to elicit if-then rules from the network. To assess the effectiveness of IGART and to compare its performance with that of other methods, three datasets related to power systems are employed. The experimental results demonstrate the usefulness of IGART, together with its rule extraction capability, for classification problems in power systems engineering.
Title: Improved GART neural network model for pattern classification and rule extraction with application to power systems. Journal: IEEE Transactions on Neural Networks, pp. 2310-23.
Pub Date: 2011-12-01. Epub Date: 2011-02-17. DOI: 10.1109/TNN.2011.2106163
Jan Chorowski, Jacek M Zurada
Rule extraction from neural networks (NNs) solves two fundamental problems: it gives insight into the logic behind the network and, in many cases, it improves the network's ability to generalize the acquired knowledge. This paper presents a novel eclectic approach to rule extraction from NNs, named LOcal Rule Extraction (LORE), suited for multilayer perceptron networks with discrete (logical or categorical) inputs. The extracted rules mimic network behavior on the training set and relax this condition on the remaining input space. First, a multilayer perceptron network is trained under a standard regime. It is then transformed into an equivalent form that returns the same numerical result as the original network, yet is able to produce rules generalizing the network output for cases similar to a given input. The partial rules extracted for every training set sample are then merged to form a decision diagram (DD) from which logic rules can be extracted. A rule format is presented that explicitly separates subsets of inputs for which an answer is known from those with an undetermined answer. A special data structure, the decision diagram, is introduced to allow efficient merging of partial rules. With regard to rule complexity and generalization ability, LORE gives results comparable to those reported previously. An algorithm transforming DDs into interpretable Boolean expressions is described. Experimental running times of rule extraction are proportional to the network's training time.
Title: Extracting rules from neural networks as decision diagrams. Journal: IEEE Transactions on Neural Networks, vol. 22, no. 12, pp. 2435-46.