Neurocomputing

Volume 167, 1 November 2015, Pages 578-586

Common nature of learning between BP-type and Hopfield-type neural networks

https://doi.org/10.1016/j.neucom.2015.04.032

Abstract

Being two famous types of neural networks, the error back-propagation (BP) algorithm based neural networks (i.e., BP-type neural networks, BPNNs) and Hopfield-type neural networks (HNNs) have been proposed, developed, and investigated extensively for scientific research and engineering applications. They differ from each other a great deal in terms of network architecture, physical meaning and training pattern. In this paper of literature-review type, we present in a relatively complete and creative manner the common nature of learning between BP-type and Hopfield-type neural networks for solving various (mathematical) problems. Specifically, comparing the BPNN with the HNN for the same problem-solving task, e.g., matrix inversion as well as function approximation, we show that the BPNN weight-updating formula and the HNN state-transition equation turn out to be essentially the same. Such an interesting phenomenon promises that, given a neural-network model for a specific problem solving, its potential dual neural-network model can thus be developed.

Introduction

As a branch of artificial intelligence, artificial neural networks (ANNs) have attracted considerable attention as candidates for novel computational systems [1], [2], [3], [4], [5]. Benefiting from parallel-processing capability, inherent nonlinearity, distributed storage and adaptive learning ability, a rich repertoire of ANNs (such as those shown in Fig. 1) have been developed and investigated [6], [7], [8], [9], [10], [11], [12], [13], [14], [15], [16], [17]. They have been applied widely in many scientific and engineering fields, such as data mining, classification and diagnosis, image and signal processing, control-system design, and equation solving. Generally speaking, an ANN is an interconnected group of artificial neurons that uses a mathematical or computational model for information processing based on a connectionist approach to computation [4], [5]. In view of this point, ANNs can also be termed simulated neural networks or simply neural networks. Note that, according to the definition of an ANN, the combination of inputs, outputs, neurons, and connection weights constitutes the architecture of a neural network. Therefore, ANNs can be classified into different categories in terms of architecture. Specifically, according to the nature of connectivity, neural networks can be classified into two categories: feedforward neural networks and feedback neural networks [4], [5]. In general, the working signal is not fed back between neurons of a feedforward neural network, while the working signal is fed back between neurons of a feedback (recurrent) neural network. That is, feedforward neural networks have no loops, while feedback neural networks have loops because of feedback connections [3], [5]. More specifically, in a feedforward neural network, signals propagate in only one direction, from an input stage through intermediate neurons to an output stage; that is, data from neurons of a lower layer are sent to neurons of an upper layer via feedforward connections. By contrast, in a feedback neural network, signals can propagate from the output of any neuron to the input of any neuron, and thus data are brought from neurons of an upper layer back to neurons of a lower layer [4], [5].

Being one classical feedforward neural network, the one based on the error back-propagation (BP) algorithm, i.e., the BP neural network, was developed through the work of Werbos [18] and of Rumelhart, McClelland and others [8] in the mid-1970s and 1980s. By following such inspiring thoughts, more and more neural networks based on the BP algorithm or its variants (termed BP-type neural networks, BPNNs) have been developed and involved in many theoretical analyses and real-world applications [13], [14], [15], [19], [20], [21], [22], [23], [24], [25], [26], [27], [28], [29]. In general, with a number of artificial neurons connected, a feedforward neural network can be constructed, e.g., the one shown in the left of Fig. 1. Then, by means of the iterative BP training procedure, such a neural network acquires remarkable approximation, generalization and prediction abilities. Note that, theoretically speaking, a 3-layer feedforward neural network can approximate any nonlinear continuous function with arbitrary accuracy. Besides, for these BPNNs, the classical error back-propagation algorithm can be summarized simply as

W(k+1) = W(k) + ΔW = W(k) − η ∂e(W)/∂W |_{W=W(k)},  (1)

where W denotes a vector or matrix of neural weights, and k = 0, 1, 2, … denotes the iteration index during the training procedure. In addition, ΔW denotes the weight-updating value at the kth iteration of the training procedure, with η denoting the learning rate (or, learning step size), which should be small enough. Furthermore, e(W) denotes the error function that is used to monitor and control such a BP-training procedure. By iteratively adjusting the weights and biases of the neural network so as to minimize the network-output error e(W), such a BPNN can be trained for mathematical problem solving, function approximation, system identification, or pattern classification. Nowadays, the BPNN is one of the most widely used neural-network models in computational-intelligence research and engineering fields [13], [14], [15], [21], [22], [23], [24], [25], [26], [27], [28], [29]. For example, Xu et al. developed a BPNN model to map the complex nonlinear relationship between microstructural parameters and the elastic modulus of a composite [25]. In [26], a BPNN improved by a genetic algorithm (GA) was established by Zhang et al. to model the relation between the welding appearance and the characteristics of the molten-pool shadows. Besides, integrating the BPNN with the GA, an iteration optimization approach was presented and investigated by Huang et al. [27].
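To make the update rule (1) concrete, the following minimal sketch applies it to a single linear layer trained on synthetic data; the layer, the squared-error function e(W) and the data here are illustrative assumptions of ours, not the networks studied in the paper.

import numpy as np

# BP-style update (1): W(k+1) = W(k) - eta * dE/dW evaluated at W(k).
def bp_weight_update(W, grad_e, eta):
    return W - eta * grad_e

# Illustrative setting: a linear mapping y = W x with squared error
# e(W) = ||W X - T||_F^2 / 2, whose gradient is (W X - T) X^T.
rng = np.random.default_rng(0)
X = rng.standard_normal((3, 50))        # 3 input features, 50 training samples
T = rng.standard_normal((2, 3)) @ X     # targets produced by a "true" mapping
W = np.zeros((2, 3))                    # initial weight matrix

for k in range(2000):
    grad = (W @ X - T) @ X.T            # gradient of the error function e(W)
    W = bp_weight_update(W, grad, eta=0.001)

print(np.linalg.norm(W @ X - T))        # training error decreases toward zero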

Being another classical neural network but with feedback, the Hopfield neural network was originally proposed by Hopfield in the early 1980s [9]. This seminal work has inspired many researchers to investigate alternative neural networks for solving a wide variety of mathematical problems arising in numerous fields of science and engineering (e.g., matrix inversion in robotic redundancy resolution). Thus, lots of Hopfield-type neural networks (HNNs) have been proposed, developed and investigated [4], [5], [10], [15], [16], [17], [30], [31], [32], [33], e.g., the one shown in the right of Fig. 1. Note that such HNNs are sometimes called gradient neural networks, in the sense that the gradient-descent method is generally exploited in the HNN design process. Traditionally speaking, to obtain an HNN for a specific problem solving, a norm-based scalar-valued lower-bounded energy function is firstly constructed such that its minimal point is the solution to the problem; and, secondly, an HNN is developed to evolve along the typical negative-gradient direction of the energy function until the minimum of the energy function is reached. In mathematics, the dynamics-model description of such an HNN for solving a specific problem f(X) = 0 [with f(·) being a smooth linear or nonlinear mapping] can be summarized simply and readily as

Ẋ = −γ ∂ε(X)/∂X,  (2)

where γ > 0 ∈ R is used to scale the convergence rate of the HNN model, and Ẋ denotes the time derivative of the state X of the HNN. In addition, the energy function ε(X) = ‖f(X)‖²/2, with ‖·‖ denoting the two-norm of a vector or the Frobenius norm of a matrix accordingly. Evidently, derived from (2), the resultant HNN models are generally depicted in explicit dynamics. In other words, such HNNs exhibit dynamical behavior; i.e., given an initial state, the state of an HNN evolves as time elapses [4], [5]. If the HNN is stable without oscillation, an equilibrium state can eventually be reached, which is exactly the solution to the problem f(X) = 0. Recently, due to the in-depth research on neural networks, the artificial neural-dynamics approach based on HNNs has been viewed as a powerful alternative for online computation and optimization [4], [5], [15], [16], [17], [30], [31], [32], [33]. Especially, being a class of HNNs, fractional/fractional-order neural networks have been designed, analyzed, and applied to different applications [34], [35], [36], [37], [38], [39].
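Likewise, the HNN design (2) can be prototyped by forward-Euler integration of the negative-gradient flow of ε(X). In the sketch below, the problem f(x) = Ax − b = 0 (a linear system), the Euler step size and the matrix A are illustrative choices of ours, used only to show the state evolving toward an equilibrium.

import numpy as np

# HNN design (2): integrate X' = -gamma * dE/dX for E(X) = ||f(X)||^2 / 2.
rng = np.random.default_rng(1)
A = rng.standard_normal((4, 4)) + 4 * np.eye(4)   # well-conditioned coefficient matrix
b = rng.standard_normal(4)

gamma, dt = 1.0, 0.01     # convergence-rate parameter and Euler step size
x = np.zeros(4)           # initial state of the network

for _ in range(20000):
    grad = A.T @ (A @ x - b)     # dE/dx for E(x) = ||A x - b||^2 / 2
    x = x - dt * gamma * grad    # Euler step along the negative-gradient direction

print(np.linalg.norm(A @ x - b)) # residual of f(x) = 0 approaches zero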

Comparing BPNNs and HNNs (with their general architectures shown in Fig. 1), we find that they differ from each other a great deal, e.g., with different origins, concepts, definitions, physical meanings, structures, and training patterns. However, as we may realize, the presented BP algorithm (1) is essentially a gradient-descent based error-minimization method, which adjusts the weights to bring the neural-network input/output behavior into a desired mapping for some specific application tasks or environments. In addition, as mentioned previously, the gradient-descent method is also exploited to construct HNNs for problem solving. Thus, we have asked ourselves a special question: does there exist a relationship between BP-type and Hopfield-type neural networks for a specific problem solving? The answer appears to be YES, which is illustrated in this paper of literature-review type through six different positive examples together with an application to function approximation.

In detail, this paper presents the common nature of learning between BPNNs and HNNs for solving various mathematical problems encountered in science and engineering fields. More specifically, comparing the BPNN and the HNN for the same problem-solving task (e.g., matrix inversion or function approximation), we show that the BPNN weight-updating formula and the HNN state-transition equation turn out to be essentially the same (that is, mathematically the same). In other words, these two neural networks can essentially possess the same mathematical expressions and computational results, though they are deemed completely different. This interesting phenomenon makes us believe that, given a neural-network model for a specific problem solving, its potential dual neural-network model can be developed correspondingly. Note that HNNs and BPNNs have stood in the field of neural networks for decades (specifically, since the work of Hopfield [9] in 1982, the work of Werbos [18] in 1974 and the work of Rumelhart, McClelland et al. [8] in 1986). Differing from the separate research on these two different types of neural networks, this paper, in a relatively complete and creative manner, reveals the connections/links (or, the common nature of learning) between BPNNs and HNNs for solving various (mathematical) problems, enriched with an application to function approximation. This is an evident state-of-the-art merit, as it establishes a significant and unique research bridge between BPNNs and HNNs.
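As a numerical hint of this equivalence, the sketch below treats matrix inversion with the standard gradient-descent designs: a BP-style weight iteration and a forward-Euler discretization of the HNN flow, both derived from the energy ‖AW − I‖²_F/2. The paper's exact formulas (3) and (5) are not reproduced in the snippets below, so this is an illustrative reconstruction under that assumption; with the learning rate matched to γ·Δt, the two iterations coincide entry by entry.

import numpy as np

rng = np.random.default_rng(2)
n = 4
A = rng.standard_normal((n, n)) + n * np.eye(n)   # invertible test matrix
I = np.eye(n)

eta = 0.001              # BPNN learning rate
gamma, dt = 1.0, 0.001   # HNN convergence rate and Euler step, with gamma * dt = eta

W = np.zeros((n, n))     # BPNN "weight" matrix
X = np.zeros((n, n))     # HNN "state" matrix

for k in range(50000):
    # BPNN: gradient-descent weight update for e(W) = ||A W - I||_F^2 / 2
    W = W - eta * (A.T @ (A @ W - I))
    # HNN: forward-Euler step of X' = -gamma * A^T (A X - I)
    X = X - dt * gamma * (A.T @ (A @ X - I))

print(np.allclose(W, X))            # the two iterate sequences are identical
print(np.linalg.norm(A @ X - I))    # both converge to the inverse of A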

Section snippets

BPNN and HNN for matrix inversion

In this section, let us take matrix inversion (which may be encountered in robotic redundancy resolution) [4], [5], [40], [41] as a case study. We can now develop the corresponding BPNN and HNN via their characteristic design procedures. As we know, in mathematics, the matrix inversion problem can be generally formulated as AX = I, where A ∈ R^{n×n} is the coefficient matrix, I ∈ R^{n×n} is the identity matrix, and X ∈ R^{n×n} denotes the unknown matrix to be obtained. To lay a basis for further discussion, matrix A is

Unification, comparison and illustration

Comparing the BPNN weight-updating formula (3) and the HNN state-transition Eq. (5) carefully, we can observe that, although the presented BPNN and HNN are completely different, they essentially possess the same mathematical expressions. Specifically, for matrix inversion, the weight matrix W(k) of (3) in BPNN corresponds exactly to the state matrix X(k) of (5) in HNN. Therefore, unifying via the same computational-intelligence governing equation, we find out a common nature of learning (or

Application to function approximation

For the purpose of further substantiating the common nature of learning between BPNNs and HNNs, two more general BPNN and HNN models are constructed and applied to function approximation in this section. Specifically, these two models for function approximation are shown in Fig. 4. As for the BPNN, we set all thresholds of the neural network to be zero and fix all connecting weights from the input layer to the hidden layer to be one. Besides, the input layer and the output layer are
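A minimal sketch of such a fixed-hidden-layer BPNN is given below. The snippet above fixes the thresholds to zero and the input-to-hidden weights to one but does not reproduce the hidden activation functions or Fig. 4, so the power activations x**j used here (giving a polynomial hidden basis) and the target function are illustrative assumptions; only the hidden-to-output weights are trained with the gradient rule (1).

import numpy as np

# Fixed hidden layer: unit input-to-hidden weights, zero thresholds, and
# (assumed) power activations x**j, so hidden outputs form a polynomial basis.
def hidden_outputs(x, m=8):
    return np.vstack([x**j for j in range(m)]).T   # shape: (samples, m)

x = np.linspace(-1.0, 1.0, 200)
target = np.sin(np.pi * x)        # example target function to approximate

H = hidden_outputs(x)             # hidden-layer responses (fixed, not trained)
w = np.zeros(H.shape[1])          # trainable hidden-to-output weights
eta = 0.01

for k in range(20000):
    err = H @ w - target                 # network output minus target
    w = w - eta * (H.T @ err) / len(x)   # gradient step on e(w) = ||H w - target||^2 / (2N)

print(np.max(np.abs(H @ w - target)))    # maximum approximation error over the samples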

Conclusions

This paper has presented and investigated the common nature of learning between BPNNs and HNNs for solving various (mathematical) problems. That is, there exists a common nature of learning between the BPNN, being a feedforward neural network, and the HNN, being a feedback neural network, for solving a specific problem (e.g., matrix inversion or function approximation). Such a novel and significant result makes us believe that links possibly exist among many different types of neural networks

Acknowledgments

The authors would like to thank the editors and anonymous reviewers for their valuable suggestions and constructive comments, which have helped the authors further improve the presentation and quality of the paper.


References (46)

  • T. Wang et al., Finite-time state estimation for delayed Hopfield neural networks with Markovian jump, Neurocomputing (2015).
  • C. Song et al., Dynamics in fractional-order neural networks, Neurocomputing (2014).
  • S. Zhang et al., Mittag–Leffler stability of fractional-order Hopfield neural networks, Nonlinear Anal.: Hybrid Syst. (2015).
  • H. Wang et al., Global stability analysis of fractional-order Hopfield neural networks with time delay, Neurocomputing (2015).
  • J. Yu et al., Projective synchronization for fractional neural networks, Neural Netw. (2014).
  • F. Wang et al., Asymptotic stability of delayed fractional-order neural networks with impulsive effects, Neurocomputing (2015).
  • D. Guo et al., Zhang neural network, Getz–Marsden dynamic system, and discrete-time algorithms for time-varying matrix inversion with application to robots' kinematic control, Neurocomputing (2012).
  • Y. Zhang et al., Cross-validation based weights and structure determination of Chebyshev-polynomial neural networks for pattern classification, Pattern Recogn. (2014).
  • D.P. Mandic et al., Recurrent Neural Network for Prediction (2001).
  • Y. Zhang, Analysis and design of recurrent neural networks and their applications to control and robotic systems (Ph.D....
  • Y. Zhang et al., Zhang Neural Networks and Neural-Dynamic Method (2011).
  • M.M. Khan et al., Fast learning neural networks using Cartesian genetic programming, Neurocomputing (2013).
  • D.E. Rumelhart et al., PDP Research Group, Parallel Distributed Processing (1986).

    Dongsheng Guo received the B.S. degree in Automation in 2010 from Sun Yat-sen University, Guangzhou, China, where he is currently working toward the Ph.D. degree in Communication and Information Systems in the School of Information Science and Technology. He is also now with the SYSU-CMU Shunde International Joint Research Institute, Foshan, China, for cooperative research. He has been continuing his research work under the supervision of Prof. Y. Zhang since 2008. His current research interests include neural networks, numerical methods, and robotics.

    Yunong Zhang received the B.S. degree from Huazhong University of Science and Technology, Wuhan, China, in 1996, the M.S. degree from South China University of Technology, Guangzhou, China, in 1999, and the Ph.D. degree from Chinese University of Hong Kong, Shatin, Hong Kong, China, in 2003. He is currently a professor with the School of Information Science and Technology, Sun Yat-sen University, Guangzhou. Before joining Sun Yat-sen University in 2006, he had been with the National University of Singapore, Singapore, the University of Strathclyde, Glasgow, UK, and the National University of Ireland, Maynooth, Ireland, since 2003. He is also now with the SYSU-CMU Shunde International Joint Research Institute, Foshan, China, for cooperative research. His main research interests include neural networks, robotics, computation, and optimization.

    Zhengli Xiao received the B.S. degree in Software Engineering from Changchun University of Science and Technology, Changchun, China, in 2013. He is currently pursuing the M.S. degree in Department of Computer Science with the School of Information Science and Technology, Sun Yat-sen University, Guangzhou. He is also now with the SYSU-CMU Shunde International Joint Research Institute, Foshan, China, for cooperative research. His current research interests include neural networks, intelligent information processing, and learning machines.

    Mingzhi Mao received the B.S., M.S., and Ph.D. degrees in the Department of Computer Science from Sun Yat-sen University, Guangzhou, China, 1988, 1998, and 2008, respectively. He is currently a professor at the School of Information Science and Technology, Sun Yat-sen University, Guangzhou, China. His main research interests include intelligence algorithm, software engineering, and management information system.

    Jianxi Liu received the B.S. degree in Communication Engineering from Sun Yat-sen University, Guangzhou, China, in 2011. He is currently pursuing the M.S. degree in the Department of Automation with the School of Information Science and Technology, Sun Yat-sen University, Guangzhou. He is also now with the SYSU-CMU Shunde International Joint Research Institute, Foshan, China, for cooperative research. His current research interests include neural networks and its application in population research.

This work is supported by the 2012 Scholarship Award for Excellent Doctoral Student Granted by the Ministry of Education of China (No. 3191004), and by the Foundation of Key Laboratory of Autonomous Systems and Networked Control, Ministry of Education, China (No. 2013A07). Besides, kindly note that all authors of the paper jointly share the first authorship.
