RECURRENT NEURAL NETWORKS DERIVED FROM THE STATE-SPACE MODEL

  • ABSTRACT

    A class of fully recurrent neural networks structured to resemble the state-space model is proposed, together with a measure of goodness. A real-time version of the backpropagation through time learning algorithm is presented. The modification of the backpropagation through time algorithm is developed for neural networks whose feedback connections originate from the last layer, while its extension to other types (sources) of feedback is discussed. Experiments on estimating ARMA parameters have shown that, without any a priori knowledge, these neural networks can estimate the parameters of non-linear dynamic processes as effectively as a more complex system using only feedforward neural networks. The performance of the new networks compares well with that of previously reported, and more complex, partially recurrent neural networks developed specially for the identification and modelling of higher-order systems.


  • KEYWORDS

    recurrent neural networks, neural networks, learning algorithm, backpropagation, backpropagation through time, system identification, state-space model, ARMA model


  • INTRODUCTION

    One of the central questions in system theory is identification/modelling. The identification/modelling problem is described in 1.1, where the most common models developed for that purpose are also presented. Since real systems are generally non-linear, identification of non-linear systems might be expected to be of particular interest; despite this, in practice there is a general tendency to treat systems as linear. Differences between linear and non-linear systems are summarised in 1.1.1, underlining the reasons why non-linear theory is so frequently avoided. Presumably the most important reason is the absence of canonical structures and common approaches for non-linear systems, whereas these are available in linear theory. Neural networks (NNs), themselves dynamic and possibly non-linear adaptive systems, look very promising for the purpose of non-linear identification. A fundamental description of the NNs used in this article (the perceptron and the multilayered perceptron) is provided in 1.2. Although NNs are often understood as systems without internal memory (except for their learning capability), there is a class, known as recurrent NNs, provided with that feature. Three models of recurrent NNs, one of them developed particularly for system identification (the modified Elman's NN), are introduced in 1.2.1. A NN requires, and its success relies on, a learning algorithm. Sections 1.3 and 1.3.1 deal with learning algorithms for ordinary/feedforward and recurrent NNs, respectively. The most popular algorithm, backpropagation, is presented in Appendix A, and its extension to recurrent NNs, backpropagation through time, in Appendix C. Another learning algorithm developed particularly for recurrent NNs, real-time recurrent learning, is given in Appendix B. There exist two different notions of what should be called the first layer of a NN: according to one, the first layer is also a processing layer, while according to the other it is the layer of input buffers. In this article the first layer is assumed to be a processing layer.

    Application of NNs to the system identification task is a very recent area of research, but in the last few years there have been some limited efforts to (develop and) apply recurrent NNs for that purpose. The major drawback is the lack of simple real-time learning algorithms for recurrent NNs; other difficulties are the limited number of recurrent models proposed so far and their tenuous connection with existing models in system theory. Some approaches to system identification using NNs, and their drawbacks, are reported in 1.4, while the initial results with recurrent NNs available so far are presented in 1.4.1.

    The new class of recurrent NNs is derived from the state-space model (introduced in 1.1) in 2, and related to the most important models for system identification described in 1.1. The proposed neural networks, unlike existing recurrent neural networks, have context layers integrated with the processing layers and are fully recurrent, which implies that all weights of the feedback connections are trainable. The structure of NNs of the new class is shown to reduce, in some special cases, to the structures of the NNs presented in 1.2 and 1.2.1. It is of great practical value to have a measure that shows how well a system is identified by the NN. Such a measure of goodness is derived in 2 using general time series analysis (introduced in 1.1), and can be used as a goal function for an optimisation technique. To train the NNs of the proposed class, as well as other NNs having time-delayed connections, the backpropagation through time algorithm (introduced in 1.3.1 and described in Appendix C) is modified for real-time learning. The modification and its application are described in 2.1.

    In 3 a chosen representative of the proposed class is compared with existing models. First, in 3.1, it is compared with an identification system (introduced in 1.4) using feedforward NNs (introduced in 1.2) on a non-linear system identification task; then it is compared with recurrent NNs (introduced in 1.2.1) on second- and third-order linear system identification in 3.2 and 3.3, respectively. The results are discussed in 3.4. Finally, a conclusion is given in 4. Since the derivation of the proposed class and the modification of the learning algorithm are considered an initial step toward the application of recurrent NNs to real system identification, two directions for further research are suggested.
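    The state-space model and the ARMA model are only referenced above; for orientation, the standard discrete-time textbook forms can be sketched as follows (the notation here is generic and not necessarily that used later in the article):

```latex
% General (possibly non-linear) discrete-time state-space model:
\begin{aligned}
  x(k+1) &= f\bigl(x(k),\, u(k)\bigr) && \text{(state transition)}\\
  y(k)   &= g\bigl(x(k),\, u(k)\bigr) && \text{(output map)}
\end{aligned}
% Linear special case:
\begin{aligned}
  x(k+1) &= A\,x(k) + B\,u(k)\\
  y(k)   &= C\,x(k) + D\,u(k)
\end{aligned}
% ARMA(p, q) model, with white-noise input e(k):
y(k) = \sum_{i=1}^{p} a_i\, y(k-i) \;+\; \sum_{j=1}^{q} b_j\, e(k-j) \;+\; e(k)
```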
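    The context-layer idea behind the recurrent NNs discussed above can be illustrated with a minimal Elman-style forward pass: the hidden state from the previous time step is fed back as an additional input at the current step. This is a generic sketch (layer sizes and weight initialisation are arbitrary assumptions), not the article's proposed architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical layer sizes for illustration only.
n_in, n_hidden, n_out = 1, 4, 1

# Weight matrices: input->hidden, context(previous hidden)->hidden, hidden->output.
W_in  = rng.normal(scale=0.5, size=(n_hidden, n_in))
W_ctx = rng.normal(scale=0.5, size=(n_hidden, n_hidden))
W_out = rng.normal(scale=0.5, size=(n_out, n_hidden))

def forward(u_seq):
    """Run the network over an input sequence; return the output sequence."""
    h = np.zeros(n_hidden)          # context layer starts empty
    outputs = []
    for u in u_seq:
        # New hidden state depends on the current input AND the previous state.
        h = np.tanh(W_in @ np.atleast_1d(u) + W_ctx @ h)
        outputs.append(W_out @ h)   # linear read-out
    return np.array(outputs)

y = forward([0.0, 1.0, -1.0, 0.5])
print(y.shape)  # -> (4, 1): one output per time step
```

    Because of the feedback through `W_ctx`, identical inputs at different time steps generally produce different outputs, which is exactly the internal-memory property that feedforward NNs lack.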