Pub Date: 2011-11-01 | Epub Date: 2011-09-29 | DOI: 10.1109/TNN.2011.2166561
Yajun Zhang, Tianyou Chai, Hong Wang
This paper presents a novel nonlinear control strategy for a class of uncertain single-input single-output (SISO) discrete-time nonlinear systems with unstable zero-dynamics. The proposed method combines an adaptive-network-based fuzzy inference system (ANFIS) with multiple models: a linear robust controller, an ANFIS-based nonlinear controller, and a switching mechanism are integrated using a multiple-models technique. It is shown that the linear controller ensures the boundedness of the input and output signals, while the nonlinear controller improves the dynamic performance of the closed-loop system. Moreover, it is also shown that the switching mechanism simultaneously guarantees closed-loop stability and improves performance. As a result, the controller has three outstanding features compared with existing control strategies. First, the method relaxes the commonly used assumption of uniform boundedness on the unmodeled dynamics and thus broadens its applicability. Second, since ANFIS is used to estimate and compensate for the effect of the unmodeled dynamics, the convergence rate of neural network learning is increased. Third, a "one-to-one mapping" technique is adopted to guarantee the universal approximation property of ANFIS. The proposed controller is applied to a numerical example and to the pulverizing process of an alumina sintering system, where its effectiveness is demonstrated.
{"title":"A nonlinear control method based on ANFIS and multiple models for a class of SISO nonlinear systems and its application.","authors":"Yajun Zhang, Tianyou Chai, Hong Wang","doi":"10.1109/TNN.2011.2166561","DOIUrl":"https://doi.org/10.1109/TNN.2011.2166561","url":null,"abstract":"<p><p>This paper presents a novel nonlinear control strategy for a class of uncertain single-input and single-output discrete-time nonlinear systems with unstable zero-dynamics. The proposed method combines adaptive-network-based fuzzy inference system (ANFIS) with multiple models, where a linear robust controller, an ANFIS-based nonlinear controller and a switching mechanism are integrated using multiple models technique. It has been shown that the linear controller can ensure the boundedness of the input and output signals and the nonlinear controller can improve the dynamic performance of the closed loop system. Moreover, it has also been shown that the use of the switching mechanism can simultaneously guarantee the closed loop stability and improve its performance. As a result, the controller has the following three outstanding features compared with existing control strategies. First, this method relaxes the assumption of commonly-used uniform boundedness on the unmodeled dynamics and thus enhances its applicability. Second, since ANFIS is used to estimate and compensate the effect caused by the unmodeled dynamics, the convergence rate of neural network learning has been increased. Third, a \"one-to-one mapping\" technique is adapted to guarantee the universal approximation property of ANFIS. The proposed controller is applied to a numerical example and a pulverizing process of an alumina sintering system, respectively, where its effectiveness has been justified.</p>","PeriodicalId":13434,"journal":{"name":"IEEE transactions on neural networks","volume":"22 11","pages":"1783-95"},"PeriodicalIF":0.0,"publicationDate":"2011-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1109/TNN.2011.2166561","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"30181108","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2011-11-01 | Epub Date: 2011-09-26 | DOI: 10.1109/TNN.2011.2167240
Ke Chen, Ahmad Salman
Speech signals convey various yet mixed information, ranging from linguistic to speaker-specific information. However, most acoustic representations characterize all of these kinds of information as a whole, which can hinder a speech or speaker recognition (SR) system from achieving better performance. In this paper, we propose a novel deep neural architecture (DNA) for learning speaker-specific characteristics from mel-frequency cepstral coefficients (MFCCs), an acoustic representation commonly used in both speech recognition and SR, which results in a speaker-specific overcomplete representation. To learn intrinsic speaker-specific characteristics, we formulate an objective function consisting of contrastive losses, defined in terms of speaker similarity/dissimilarity, and data reconstruction losses used as regularization to reduce the interference of non-speaker-related information. Moreover, we employ a hybrid strategy for learning the parameters of the deep neural networks: local yet greedy layerwise unsupervised pretraining for initialization, followed by global supervised learning for the ultimate discriminative goal. On four Linguistic Data Consortium (LDC) benchmarks and two non-English corpora, we demonstrate that our overcomplete representation is robust in characterizing various speakers, regardless of whether their utterances were used in training our DNA, and is highly insensitive to text and to the language spoken. Extensive comparative studies suggest that our approach yields favorable results in speaker verification and segmentation. Finally, we discuss several issues concerning our proposed approach.
{"title":"Learning speaker-specific characteristics with a deep neural architecture.","authors":"Ke Chen, Ahmad Salman","doi":"10.1109/TNN.2011.2167240","DOIUrl":"https://doi.org/10.1109/TNN.2011.2167240","url":null,"abstract":"<p><p>Speech signals convey various yet mixed information ranging from linguistic to speaker-specific information. However, most of acoustic representations characterize all different kinds of information as whole, which could hinder either a speech or a speaker recognition (SR) system from producing a better performance. In this paper, we propose a novel deep neural architecture (DNA) especially for learning speaker-specific characteristics from mel-frequency cepstral coefficients, an acoustic representation commonly used in both speech recognition and SR, which results in a speaker-specific overcomplete representation. In order to learn intrinsic speaker-specific characteristics, we come up with an objective function consisting of contrastive losses in terms of speaker similarity/dissimilarity and data reconstruction losses used as regularization to normalize the interference of non-speaker-related information. Moreover, we employ a hybrid learning strategy for learning parameters of the deep neural networks: i.e., local yet greedy layerwise unsupervised pretraining for initialization and global supervised learning for the ultimate discriminative goal. With four Linguistic Data Consortium (LDC) benchmarks and two non-English corpora, we demonstrate that our overcomplete representation is robust in characterizing various speakers, no matter whether their utterances have been used in training our DNA, and highly insensitive to text and languages spoken. Extensive comparative studies suggest that our approach yields favorite results in speaker verification and segmentation. Finally, we discuss several issues concerning our proposed approach.</p>","PeriodicalId":13434,"journal":{"name":"IEEE transactions on neural networks","volume":"22 11","pages":"1744-56"},"PeriodicalIF":0.0,"publicationDate":"2011-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1109/TNN.2011.2167240","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"30026908","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2011-11-01 | Epub Date: 2011-09-06 | DOI: 10.1109/TNN.2011.2165556
Fang-Xiang Wu
Genetic regulatory networks can be described by nonlinear differential equations with time delays. In this paper, we study both local and global delay-independent stability of genetic regulatory networks, taking messenger ribonucleic acid (mRNA) alternative splicing into consideration. Based on nonnegative matrix theory, we first develop necessary and sufficient conditions for local delay-independent stability of genetic regulatory networks with multiple time delays. Compared with previous results, these conditions are easy to verify. We then develop sufficient conditions for global delay-independent stability; compared with previous results, this sufficient condition is less conservative. To illustrate the theorems developed in this paper, we analyze the delay-independent stability of two genetic regulatory networks: a real-life repressilatory network with three genes and three proteins, and a synthetic gene regulatory network with five genes and seven proteins. The simulation results show that the theorems developed in this paper can effectively determine the delay-independent stability of genetic regulatory networks.
{"title":"Delay-independent stability of genetic regulatory networks.","authors":"Fang-Xiang Wu","doi":"10.1109/TNN.2011.2165556","DOIUrl":"https://doi.org/10.1109/TNN.2011.2165556","url":null,"abstract":"<p><p>Genetic regulatory networks can be described by nonlinear differential equations with time delays. In this paper, we study both locally and globally delay-independent stability of genetic regulatory networks, taking messenger ribonucleic acid alternative splicing into consideration. Based on nonnegative matrix theory, we first develop necessary and sufficient conditions for locally delay-independent stability of genetic regulatory networks with multiple time delays. Compared to the previous results, these conditions are easy to verify. Then we develop sufficient conditions for global delay-independent stability for genetic regulatory networks. Compared to the previous results, this sufficient condition is less conservative. To illustrate theorems developed in this paper, we analyze delay-independent stability of two genetic regulatory networks: a real-life repressilatory network with three genes and three proteins, and a synthetic gene regulatory network with five genes and seven proteins. The simulation results show that the theorems developed in this paper can effectively determine the delay-independent stability of genetic regulatory networks.</p>","PeriodicalId":13434,"journal":{"name":"IEEE transactions on neural networks","volume":"22 11","pages":"1685-93"},"PeriodicalIF":0.0,"publicationDate":"2011-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1109/TNN.2011.2165556","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"30125870","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2011-11-01 | Epub Date: 2011-09-29 | DOI: 10.1109/TNN.2011.2160968
Shahab Mehraeen, Sarangapani Jagannathan
In this paper, the direct neural dynamic programming technique is used to solve the Hamilton-Jacobi-Bellman (HJB) equation forward in time for the decentralized near-optimal regulation of a class of nonlinear interconnected discrete-time systems with unknown internal subsystem and interconnection dynamics, while the input gain matrix is assumed known. Even though the unknown interconnection terms are considered weak and are functions of the entire state vector, the decentralized control is attempted under the assumption that only the local state vector is measurable. The decentralized near-optimal controller design for each subsystem consists of two neural networks (NNs): an action NN that aims to provide a near-optimal control signal, and a critic NN that evaluates the performance of the overall system. All parameters of both NNs are tuned online. Using Lyapunov techniques, it is shown that all subsystem signals are uniformly ultimately bounded and that the synthesized subsystem inputs approach their corresponding near-optimal control inputs with bounded error. Simulation results are included to show the effectiveness of the approach.
{"title":"Decentralized optimal control of a class of interconnected nonlinear discrete-time systems by using online Hamilton-Jacobi-Bellman formulation.","authors":"Shahab Mehraeen, Sarangapani Jagannathan","doi":"10.1109/TNN.2011.2160968","DOIUrl":"https://doi.org/10.1109/TNN.2011.2160968","url":null,"abstract":"<p><p>In this paper, the direct neural dynamic programming technique is utilized to solve the Hamilton-Jacobi-Bellman equation forward-in-time for the decentralized near optimal regulation of a class of nonlinear interconnected discrete-time systems with unknown internal subsystem and interconnection dynamics, while the input gain matrix is considered known. Even though the unknown interconnection terms are considered weak and functions of the entire state vector, the decentralized control is attempted under the assumption that only the local state vector is measurable. The decentralized nearly optimal controller design for each subsystem consists of two neural networks (NNs), an action NN that is aimed to provide a nearly optimal control signal, and a critic NN which evaluates the performance of the overall system. All NN parameters are tuned online for both the NNs. By using Lyapunov techniques it is shown that all subsystems signals are uniformly ultimately bounded and that the synthesized subsystems inputs approach their corresponding nearly optimal control inputs with bounded error. Simulation results are included to show the effectiveness of the approach.</p>","PeriodicalId":13434,"journal":{"name":"IEEE transactions on neural networks","volume":"22 11","pages":"1757-69"},"PeriodicalIF":0.0,"publicationDate":"2011-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1109/TNN.2011.2160968","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"30181105","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2011-11-01 | Epub Date: 2011-10-03 | DOI: 10.1109/TNN.2011.2166979
Lan-Da Van, Di-You Wu, Chien-Shiun Chen
This paper presents an energy-efficient fast independent component analysis (FastICA) implementation with an early determination scheme for eight-channel electroencephalogram (EEG) signal separation. The main contributions are as follows: (1) an energy-efficient FastICA using the proposed early determination scheme and the corresponding architecture; (2) a cost-effective FastICA using the proposed preprocessing unit architecture, with one coordinate rotation digital computer (CORDIC)-based eigenvalue decomposition processor, and the proposed one-unit architecture with a hardware reuse scheme; and (3) a low-computation-time FastICA using the four-parallel one-unit architecture. The resulting power dissipation of the FastICA implementation for eight-channel EEG signal separation is 16.35 mW at 100 MHz and 1.0 V. Compared with the design without early determination, the proposed FastICA architecture, implemented in a United Microelectronics Corporation (UMC) 90-nm 1P9M CMOS process with a core area of 1.221 × 1.218 mm², achieves an average energy reduction of 47.63%. From the post-layout simulation results, the maximum computation time is 0.29 s.
{"title":"Energy-efficient FastICA implementation for biomedical signal separation.","authors":"Lan-Da Van, Di-You Wu, Chien-Shiun Chen","doi":"10.1109/TNN.2011.2166979","DOIUrl":"https://doi.org/10.1109/TNN.2011.2166979","url":null,"abstract":"<p><p>This paper presents an energy-efficient fast independent component analysis (FastICA) implementation with an early determination scheme for eight-channel electroencephalogram (EEG) signal separation. The main contributions are as follows: (1) energy-efficient FastICA using the proposed early determination scheme and the corresponding architecture; (2) cost-effective FastICA using the proposed preprocessing unit architecture with one coordinate rotation digital computer-based eigenvalue decomposition processor and the proposed one-unit architecture with the hardware reuse scheme; and (3) low-computation-time FastICA using the four parallel one-units architecture. The resulting power dissipation of the FastICA implementation for eight-channel EEG signal separation is 16.35 mW at 100 MHz at 1.0 V. Compared with the design without early determination, the proposed FastICA architecture implemented in united microelectronics corporation 90 nm 1P9M complementary metal-oxide-semiconductor process with a core area of 1.221 × 1.218 mm2 can achieve average energy reduction by 47.63%. From the post-layout simulation results, the maximum computation time is 0.29 s.</p>","PeriodicalId":13434,"journal":{"name":"IEEE transactions on neural networks","volume":"22 11","pages":"1809-22"},"PeriodicalIF":0.0,"publicationDate":"2011-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1109/TNN.2011.2166979","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"30182517","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2011-11-01 | Epub Date: 2011-10-06 | DOI: 10.1109/TNN.2011.2169086
Manuel Fernandez-Delgado, Jorge Ribeiro, Eva Cernadas, Senén Barro Ameneiro
Parallel perceptrons (PPs) are very simple and efficient committee machines (a single layer of perceptrons with threshold activation functions and binary outputs, and a majority-voting decision scheme) that nevertheless behave as universal approximators. The parallel delta (P-Delta) rule is an effective training algorithm which, following the ideas of statistical learning theory used by the support vector machine (SVM), raises generalization ability by maximizing the difference between the perceptron activations for the training patterns and the activation threshold (which corresponds to the separating hyperplane). In this paper, we propose an analytical closed-form expression to calculate the PPs' weights for classification tasks. Our method, called direct parallel perceptrons (DPPs), directly calculates the weights, without iterations, from the training patterns and their desired outputs, with no search or numeric function optimization. The calculated weights globally minimize an error function that simultaneously takes into account the training error and the classification margin. Given their analytical and noniterative nature, DPPs are computationally much more efficient than related approaches (P-Delta and SVM), and their computational complexity is linear in the input dimensionality. Therefore, DPPs are very appealing in terms of time complexity and memory consumption, and are very easy to use for high-dimensional classification tasks. On real benchmark datasets with two and with multiple classes, DPPs are competitive with SVMs and other approaches, but they also allow online learning and, unlike most of the alternatives, have no tunable parameters.
{"title":"Direct parallel perceptrons (DPPs): fast analytical calculation of the parallel perceptrons weights with margin control for classification tasks.","authors":"Manuel Fernandez-Delgado, Jorge Ribeiro, Eva Cernadas, Senén Barro Ameneiro","doi":"10.1109/TNN.2011.2169086","DOIUrl":"https://doi.org/10.1109/TNN.2011.2169086","url":null,"abstract":"<p><p>Parallel perceptrons (PPs) are very simple and efficient committee machines (a single layer of perceptrons with threshold activation functions and binary outputs, and a majority voting decision scheme), which nevertheless behave as universal approximators. The parallel delta (P-Delta) rule is an effective training algorithm, which, following the ideas of statistical learning theory used by the support vector machine (SVM), raises its generalization ability by maximizing the difference between the perceptron activations for the training patterns and the activation threshold (which corresponds to the separating hyperplane). In this paper, we propose an analytical closed-form expression to calculate the PPs' weights for classification tasks. Our method, called Direct Parallel Perceptrons (DPPs), directly calculates (without iterations) the weights using the training patterns and their desired outputs, without any search or numeric function optimization. The calculated weights globally minimize an error function which simultaneously takes into account the training error and the classification margin. Given its analytical and noniterative nature, DPPs are computationally much more efficient than other related approaches (P-Delta and SVM), and its computational complexity is linear in the input dimensionality. Therefore, DPPs are very appealing, in terms of time complexity and memory consumption, and are very easy to use for high-dimensional classification tasks. On real benchmark datasets with two and multiple classes, DPPs are competitive with SVM and other approaches but they also allow online learning and, as opposed to most of them, have no tunable parameters.</p>","PeriodicalId":13434,"journal":{"name":"IEEE transactions on neural networks","volume":"22 11","pages":"1837-48"},"PeriodicalIF":0.0,"publicationDate":"2011-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1109/TNN.2011.2169086","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"30196886","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2011-11-01 | Epub Date: 2011-09-06 | DOI: 10.1109/TNN.2011.2164934
Xiaobing Nie, Jinde Cao
In this paper, second-order interactions are introduced into competitive neural networks (NNs), and multistability is discussed for second-order competitive NNs (SOCNNs) with nondecreasing saturated activation functions. First, based on a decomposition of the state space, the Cauchy convergence principle, and inequality techniques, sufficient conditions are derived ensuring the local exponential stability of 2^N equilibrium points. Second, conditions are obtained for ascertaining that equilibrium points are locally exponentially stable and located in any designated region. Third, the theory is extended to more general saturated activation functions with 2r corner points, and a sufficient criterion is given under which the SOCNNs can have (r+1)^N locally exponentially stable equilibrium points. Even when there are no second-order interactions, the obtained results are less restrictive than those in some recent works. Finally, three examples with simulations are presented to verify the theoretical analysis.
{"title":"Multistability of second-order competitive neural networks with nondecreasing saturated activation functions.","authors":"Xiaobing Nie, Jinde Cao","doi":"10.1109/TNN.2011.2164934","DOIUrl":"https://doi.org/10.1109/TNN.2011.2164934","url":null,"abstract":"<p><p>In this paper, second-order interactions are introduced into competitive neural networks (NNs) and the multistability is discussed for second-order competitive NNs (SOCNNs) with nondecreasing saturated activation functions. Firstly, based on decomposition of state space, Cauchy convergence principle, and inequality technique, some sufficient conditions ensuring the local exponential stability of 2N equilibrium points are derived. Secondly, some conditions are obtained for ascertaining equilibrium points to be locally exponentially stable and to be located in any designated region. Thirdly, the theory is extended to more general saturated activation functions with 2r corner points and a sufficient criterion is given under which the SOCNNs can have (r+1)N locally exponentially stable equilibrium points. Even if there is no second-order interactions, the obtained results are less restrictive than those in some recent works. Finally, three examples with their simulations are presented to verify the theoretical analysis.</p>","PeriodicalId":13434,"journal":{"name":"IEEE transactions on neural networks","volume":"22 11","pages":"1694-708"},"PeriodicalIF":0.0,"publicationDate":"2011-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1109/TNN.2011.2164934","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"30127259","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2011-10-01 | Epub Date: 2011-08-30 | DOI: 10.1109/TNN.2011.2164265
Masayuki Karasuyama, Ichiro Takeuchi
Regularization path algorithms have been proposed to deal with the model selection problem in several machine learning approaches. These algorithms allow computation of the entire path of solutions for every value of the regularization parameter, using the fact that their solution paths have a piecewise-linear form. In this paper, we extend the applicability of the regularization path algorithm to a class of learning machines that have a quadratic loss and a quadratic penalty term. This class contains several important learning machines, such as the squared hinge loss support vector machine (SVM) and the modified Huber loss SVM. We first show that the solution paths of this class of learning machines have a piecewise-nonlinear form, and that the segments between two breakpoints are characterized by a class of rational functions. We then develop an algorithm that can efficiently follow the piecewise-nonlinear path by solving these rational equations. To solve them, we use a rational approximation technique with a quadratic convergence rate; thus, our algorithm can follow the nonlinear path much more precisely than existing approaches such as predictor-corrector-type nonlinear path approximation. We show the algorithm's performance on artificial and real datasets.
{"title":"Nonlinear regularization path for quadratic loss support vector machines.","authors":"Masayuki Karasuyama, Ichiro Takeuchi","doi":"10.1109/TNN.2011.2164265","DOIUrl":"https://doi.org/10.1109/TNN.2011.2164265","url":null,"abstract":"<p><p>Regularization path algorithms have been proposed to deal with model selection problem in several machine learning approaches. These algorithms allow computation of the entire path of solutions for every value of regularization parameter using the fact that their solution paths have piecewise linear form. In this paper, we extend the applicability of regularization path algorithm to a class of learning machines that have quadratic loss and quadratic penalty term. This class contains several important learning machines such as squared hinge loss support vector machine (SVM) and modified Huber loss SVM. We first show that the solution paths of this class of learning machines have piecewise nonlinear form, and piecewise segments between two breakpoints are characterized by a class of rational functions. Then we develop an algorithm that can efficiently follow the piecewise nonlinear path by solving these rational equations. To solve these rational equations, we use rational approximation technique with quadratic convergence rate, and thus, our algorithm can follow the nonlinear path much more precisely than existing approaches such as predictor-corrector type nonlinear-path approximation. We show the algorithm performance on some artificial and real data sets.</p>","PeriodicalId":13434,"journal":{"name":"IEEE transactions on neural networks","volume":"22 10","pages":"1613-25"},"PeriodicalIF":0.0,"publicationDate":"2011-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1109/TNN.2011.2164265","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"30110076","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2011-10-01 | Epub Date: 2011-08-04 | DOI: 10.1109/TNN.2011.2161999
Haijun Zhang, Gang Liu, Tommy W S Chow, Wenyin Liu
A novel framework using a Bayesian approach for content-based phishing web page detection is presented. Our model takes into account textual and visual content to measure the similarity between the protected web page and suspicious web pages. A text classifier, an image classifier, and an algorithm fusing the results from the two classifiers are introduced. An outstanding feature of this paper is the exploration of a Bayesian model to estimate the matching threshold, which is required in the classifier for determining the class of a web page, i.e., whether or not it is phishing. In the text classifier, the naive Bayes rule is used to calculate the probability that a web page is phishing. In the image classifier, the earth mover's distance is employed to measure visual similarity, and our Bayesian model is designed to determine the threshold. In the data fusion algorithm, Bayesian theory is used to synthesize the classification results from textual and visual content. The effectiveness of the proposed approach was examined on a large-scale dataset collected from real phishing cases. Experimental results demonstrate that the text and image classifiers deliver promising results, that the fusion algorithm outperforms either individual classifier, and that our model can be adapted to different phishing cases.
{"title":"Textual and visual content-based anti-phishing: a Bayesian approach.","authors":"Haijun Zhang, Gang Liu, Tommy W S Chow, Wenyin Liu","doi":"10.1109/TNN.2011.2161999","DOIUrl":"https://doi.org/10.1109/TNN.2011.2161999","url":null,"abstract":"<p><p>A novel framework using a Bayesian approach for content-based phishing web page detection is presented. Our model takes into account textual and visual contents to measure the similarity between the protected web page and suspicious web pages. A text classifier, an image classifier, and an algorithm fusing the results from classifiers are introduced. An outstanding feature of this paper is the exploration of a Bayesian model to estimate the matching threshold. This is required in the classifier for determining the class of the web page and identifying whether the web page is phishing or not. In the text classifier, the naive Bayes rule is used to calculate the probability that a web page is phishing. In the image classifier, the earth mover's distance is employed to measure the visual similarity, and our Bayesian model is designed to determine the threshold. In the data fusion algorithm, the Bayes theory is used to synthesize the classification results from textual and visual content. The effectiveness of our proposed approach was examined in a large-scale dataset collected from real phishing cases. Experimental results demonstrated that the text classifier and the image classifier we designed deliver promising results, the fusion algorithm outperforms either of the individual classifiers, and our model can be adapted to different phishing cases.</p>","PeriodicalId":13434,"journal":{"name":"IEEE transactions on neural networks","volume":"22 10","pages":"1532-46"},"PeriodicalIF":0.0,"publicationDate":"2011-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1109/TNN.2011.2161999","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"30063663","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2011-10-01 | Epub Date: 2011-08-22 | DOI: 10.1109/TNN.2011.2163169
Saman Razavi, Bryan A Tolson
The feedforward neural network is one of the most commonly used function approximation techniques and has been applied to a wide variety of problems arising from various disciplines. However, neural networks are black-box models with multiple challenges associated with training and generalization. This paper first looks into the internal behavior of neural networks and develops a detailed interpretation of the neural network's functional geometry. Based on this geometrical interpretation, a new set of variables describing neural networks is proposed as a more effective and geometrically interpretable alternative to the traditional set of network weights and biases. The paper then develops a new formulation for neural networks with respect to the newly defined variables; this reformulated neural network (ReNN) is equivalent to the common feedforward neural network but has a less complex error response surface. To demonstrate the learning ability of ReNN, two training methods are employed: a derivative-based algorithm (a variation of backpropagation) and a derivative-free optimization algorithm. Moreover, a new regularization measure based on the developed geometrical interpretation is proposed to evaluate and improve the generalization ability of neural networks. The value of the proposed geometrical interpretation, the ReNN approach, and the new regularization measure are demonstrated across multiple test problems. Results show that ReNN can be trained more effectively and efficiently than common neural networks, and that the proposed regularization measure is an effective indicator of how a network will perform in terms of generalization.
{"title":"A new formulation for feedforward neural networks.","authors":"Saman Razavi, Bryan A Tolson","doi":"10.1109/TNN.2011.2163169","DOIUrl":"https://doi.org/10.1109/TNN.2011.2163169","url":null,"abstract":"<p><p>Feedforward neural network is one of the most commonly used function approximation techniques and has been applied to a wide variety of problems arising from various disciplines. However, neural networks are black-box models having multiple challenges/difficulties associated with training and generalization. This paper initially looks into the internal behavior of neural networks and develops a detailed interpretation of the neural network functional geometry. Based on this geometrical interpretation, a new set of variables describing neural networks is proposed as a more effective and geometrically interpretable alternative to the traditional set of network weights and biases. Then, this paper develops a new formulation for neural networks with respect to the newly defined variables; this reformulated neural network (ReNN) is equivalent to the common feedforward neural network but has a less complex error response surface. To demonstrate the learning ability of ReNN, in this paper, two training methods involving a derivative-based (a variation of backpropagation) and a derivative-free optimization algorithms are employed. Moreover, a new measure of regularization on the basis of the developed geometrical interpretation is proposed to evaluate and improve the generalization ability of neural networks. The value of the proposed geometrical interpretation, the ReNN approach, and the new regularization measure are demonstrated across multiple test problems. Results show that ReNN can be trained more effectively and efficiently compared to the common neural networks and the proposed regularization measure is an effective indicator of how a network would perform in terms of generalization.</p>","PeriodicalId":13434,"journal":{"name":"IEEE transactions on neural networks","volume":"22 10","pages":"1588-98"},"PeriodicalIF":0.0,"publicationDate":"2011-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1109/TNN.2011.2163169","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"30092861","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}