Artificial Neural Network Identification And Control Of The Inverted Pendulum


Tim Callinan
August 2003

Acknowledgements

I would like to thank my supervisor Jennifer Bruton for her help, guidance and support throughout the project. Thank you to Conor Maguire for helping me with the inverted pendulum rig and lending me many of his manuals and books. Thank you to Anthony Holohan for allowing me to experiment on the inverted pendulum rig.

Declaration

I hereby declare that, except where otherwise indicated, this document is entirely my own work and has not been submitted in whole or in part to any other university.

Signed: ........................ Date: ........................

Abstract

This project takes the area of Artificial Neural Networks (ANN) and applies it to the inverted pendulum control problem. The inverted pendulum is typically used to benchmark new control techniques, as it is a highly non-linear, unstable system. Neural networks have unique characteristics which enable them to control non-linear systems. Feedforward and recurrent neural networks are used to model the inverted pendulum. Multi-output online identification was also researched. A neuro-controller for the inverted pendulum was developed. Traditional control methods were utilized to develop a control law to stabilize the inverted pendulum, and a feedforward network was trained to mimic the control law. The neuro-control shows that if a disturbance occurs in the system, the neural network learns to counteract this disturbance. Finally, the knowledge learned in identification and control was applied to the real time inverted pendulum rig. An online adaptive neural network was developed to model the real time system.

Table of Contents

Introduction
Outline of the document
Inverted Pendulum
Artificial Neural Networks
Advantages of ANNs
Types of Learning
Neural network structures
Multi-layered perceptrons
System Identification
System identification procedure
Linear identification of the system
Non-Linear identification of the system
Non-linear Identification using neural networks
Multi-output identification
Neural control of the inverted pendulum
Neural-control in simulink
Real-time identification and control
Conclusions
Summary
Scope for future work
Bibliography

Introduction

The process used in this project is the inverted pendulum system. The inverted pendulum is a highly nonlinear and open-loop unstable system. This means that standard linear techniques cannot model the nonlinear dynamics of the system. When the system is simulated, the pendulum falls over quickly. These characteristics make identification and control of the inverted pendulum more challenging.

There are two main aims of the project. The first is to develop an accurate model of the inverted pendulum system using neural networks. The second is to develop a neural network controller which determines the correct control action to stabilize the system, but can also learn from experience.

System identification is the procedure that develops models of a dynamic system based on the input and output signals from the system. The input and output data must show some of the dynamics of the process. The parameters of the model are adjusted until the output from the model is similar to the output of the real system. In order to develop an accurate model of the inverted pendulum, different methods (linear and nonlinear) of identification will be tested. One of the problems encountered early in the project was collecting experimental data from the inverted pendulum system. The output data from the unstable system does not show enough information about the dynamics of the system. Feedback controllers are therefore developed to stabilize the system before identification can take place.

Neural networks have shown great progress in the identification of nonlinear systems. There are certain characteristics of ANNs which assist them in identifying complex nonlinear systems. ANNs are made up of many nonlinear elements
and this gives them an advantage over linear techniques in modelling nonlinear systems. ANNs are trained by adaptive learning: the network 'learns' how to perform tasks and functions based on the data given for training. The knowledge learned during training is stored in the synaptic weights. The standard ANN structures (feedforward and recurrent) are both used to model the inverted pendulum.

The main task of this project is to design a neural network controller which keeps the pendulum system stabilized. There are three main types of neural control: supervised, direct inverse and unsupervised. Supervised learning uses an existing controller or human feedback in training the neural network. In order to train the neural network to imitate an existing controller, a vector of inputs and control targets from the controller must be collected. With supervised control, a neural network could be trained to imitate a robust controller. The robust controller operates correctly only if the process operates around a certain point. The neuro-controller operates similarly to the robust controller but can also adapt if any disturbance occurs in the system.

Direct inverse control does not require an existing controller in training. A neural network is trained to model the inverse of the process, and the neural network is then cascaded with the process. Theoretically, if the inverse model is very accurate, the nonlinearities in the ANN will cancel out the nonlinearities in the process.

Outline of the document

The Inverted Pendulum chapter details the research on the inverted pendulum system. The dynamic system equations (linear and nonlinear) are derived, and the Simulink models of the linear and nonlinear systems are developed. The development of the feedback controllers to stabilize the system is also discussed. The Artificial Neural Networks chapter covers the theory, structure and operation of artificial neural networks. The System Identification chapter covers the whole area of system identification. The procedure of system identification is discussed first. Linear identification techniques are
applied to the linear system. Nonlinear identification using neural networks is then reported. The chapter on neural control details the development of the neuro-controller. The chapter on real-time identification and control discusses the real time identification and control using the inverted pendulum rig. Finally, the Conclusions chapter provides a summary of the work, a discussion of the results and the scope for future work.

Inverted Pendulum

The inverted pendulum system is a classic control problem that is used in universities around the world. It is a suitable process for testing prototype controllers because it is highly non-linear and unstable. The system consists of an inverted pole hinged on a cart which is free to move in the x direction. In this chapter, the dynamical equations of the system are derived, the model is developed in Simulink and basic controllers are developed. The aim of developing an inverted pendulum in Simulink is that the developed model will have the same characteristics as the actual process. It will then be possible to test each of the prototype controllers in the Simulink environment.

Before the inverted pendulum model can be developed in Simulink, the system dynamical equations are derived using the Lagrange equations [1]. The Lagrange equations are one of many methods of determining the system equations. Using this method it is possible to derive dynamical system equations for a complicated mechanical system such as the inverted pendulum. Figure 1 is a free-body diagram of the pendulum system.

M: mass of the cart
m: mass of the pole
l: length of the pole
f: control force

Fig 1: Free body diagram of the inverted pendulum system

The Lagrange equations use the kinetic and potential energy in the system to determine the dynamical equations of the cart-pole system. The kinetic energy of the system is the sum of the kinetic energies of each mass. The kinetic energy, T1, of the cart is

$$T_1 = \frac{1}{2} M \dot{y}^2$$ (Eq.1)

The pole can move in both the horizontal and vertical directions, so the pole kinetic energy is

$$T_2 = \frac{1}{2} m \left( \dot{y}_2^2 + \dot{z}_2^2 \right)$$ (Eq.2)

From the free body diagram, y2 and z2 and their time derivatives are equal to

$$y_2 = y + l \sin\theta$$ (Eq.3)

$$\dot{y}_2 = \dot{y} + l \dot{\theta} \cos\theta$$ (Eq.4)

$$z_2 = l \cos\theta$$ (Eq.5)

$$\dot{z}_2 = -l \dot{\theta} \sin\theta$$ (Eq.6)

The total kinetic energy, T, of the system is equal to

$$T = T_1 + T_2 = \frac{1}{2} \left[ M \dot{y}^2 + m \left( \dot{y}_2^2 + \dot{z}_2^2 \right) \right]$$ (Eq.7)

Equations 4 and 6 are substituted into equation 7 to give equation 8:

$$T = \frac{1}{2} \left[ M \dot{y}^2 + m \left( \dot{y}^2 + 2 \dot{y} \dot{\theta} l \cos\theta + l^2 \dot{\theta}^2 \right) \right]$$ (Eq.8)

The potential energy, V, of the system is stored in the pendulum, so

$$V = m g z_2 = m g l \cos\theta$$ (Eq.9)

The Lagrangian function is

$$L = T - V = \frac{1}{2} (M + m) \dot{y}^2 + m l \cos\theta \, \dot{y} \dot{\theta} + \frac{1}{2} m l^2 \dot{\theta}^2 - m g l \cos\theta$$ (Eq.10)

The generalized coordinates of the system are y and θ, and the control force f acts on the cart coordinate y, so the Lagrange equations are

$$\frac{d}{dt} \left( \frac{\partial L}{\partial \dot{y}} \right) - \frac{\partial L}{\partial y} = f$$ (Eq.11)

$$\frac{d}{dt} \left( \frac{\partial L}{\partial \dot{\theta}} \right) - \frac{\partial L}{\partial \theta} = 0$$ (Eq.12)

But,

$$\frac{\partial L}{\partial \dot{y}} = (M + m) \dot{y} + m l \cos\theta \, \dot{\theta}$$ (Eq.13)

$$\frac{\partial L}{\partial y} = 0$$ (Eq.14)

$$\frac{\partial L}{\partial \dot{\theta}} = m l \cos\theta \, \dot{y} + m l^2 \dot{\theta}$$ (Eq.15)

$$\frac{\partial L}{\partial \theta} = m g l \sin\theta - m l \sin\theta \, \dot{y} \dot{\theta}$$ (Eq.16)

The above equations (Eq.13-16) are substituted into the Lagrange equations (Eq.11-12). The cross term in ẏθ̇ produced by differentiating Eq.15 with respect to time cancels against the same term in Eq.16, and this results in the non-linear dynamical equations for the inverted pendulum system, which are shown below.

$$(M + m) \ddot{y} + m l \cos\theta \, \ddot{\theta} - m l \dot{\theta}^2 \sin\theta = f$$ (Eq.17)

$$m l \cos\theta \, \ddot{y} + m l^2 \ddot{\theta} - m g l \sin\theta = 0$$ (Eq.18)

Some of the modelling and control techniques involved in the project are linear, so these equations must be linearized. Assuming that θ is kept small, the equations can be linearized by approximating cos θ ≈ 1 and sin θ ≈ θ. The quadratic terms, such as the term in $\dot{\theta}^2$, are also negligible. Therefore the two linear system equations are

$$\ddot{y} = \frac{f}{M} - \frac{m g}{M} \theta$$ (Eq.19)

$$\ddot{\theta} = -\frac{f}{M l} + \frac{(M + m)}{M l} g \theta$$ (Eq.20)

At this stage, a set of equations (linear and non-linear) describing the inverted pendulum has been developed. The next stage is constructing a Simulink model of the inverted pendulum system. There is no fixed procedure for developing Simulink models from dynamical state equations. The diagram below is the linear pendulum model. This model is constructed using integrators, gain blocks, etc. The model (Fig 2)
is simply a Simulink representation of the linear state equations.

Fig 2: Simulink model of the linear pendulum system

The non-linear pendulum system (Fig 4) is shown on the next page. The non-linear system, even though it is more complicated, is developed in a similar manner. Both models are large, so they are encapsulated in the subsystem blocks shown below (Fig 3). Both models are set up using a mask. The mask makes it possible to change the values of m, l, g, etc. for different simulations. The mass of the cart, M, is set to 1.2 kg, the mass of the pendulum is set to 0.11 kg and the length of the pendulum is 0.4 m. These figures are taken from the real time inverted pendulum rig.

Fig 3: Simulink blocks of the pendulum systems

At this stage there is a system that measures the position of the cart and the angle of the pendulum. There is also an interface which makes it possible to control the position of the cart. The next important part of the pendulum system is the control algorithm in MATLAB. The diagram below shows the real time kernel (RTK) in the MATLAB environment. The RTK is an encapsulated block which covers all the control tasks. The input to the RTK block is the desired cart position. The output of the real time task is a vector which contains information about the pendulum angle, the angular velocity, the cart position, the cart velocity and the control value for the DC drive.

Fig 76: Real time task in simulink environment

There is no external feedback control loop because the controller is embedded in the RTK. Two PID controllers are utilized to stabilize the inverted pendulum. The first PID controls the angular position of the pendulum. The second is used to control the position of the cart. The outputs of the PID controllers are added to produce the final DC control signal. Figure 77 shows the structure of the PID controllers.

Fig 77: Structure of the PID controller

Previously in the report it has been stated that it is not possible to control the nonlinear pendulum
using PID control. In this case, it is possible for the RTK to utilize PID control because the pendulum is placed in the upright position, within the linearised region, before the experiment starts. During the experiment the inverted pendulum can sometimes swing past the linearised region and fall over. If the pendulum falls down during the experiment, it has to be turned up manually. Figure 78 shows the pendulum angle during the experiment. It is clear that the PID control stabilizes the inverted pendulum.

Fig 78: Plot of the pendulum angle (process pendulum angle in rad against time)

The next stage is to develop a neural network which identifies the pendulum system online. A multi-layered perceptron (MLP) is used to model the angle of the pendulum, θ, and the position of the cart, y. Figure 79 shows the setup of the identification process. The PID controller is used to stabilize the inverted pendulum. Closed loop identification is necessary for open loop unstable systems. The control signal input to the pendulum system is also fed into the ANN. The error signals, eθ and ey, between the process outputs and the ANN outputs are backpropagated to adjust the weights of the MLP.

Fig 79: Setup of the online identification process

Figure 80 shows the Simulink setup of the real time identification. A multi-layered perceptron with 100 neurons in the hidden layer and a learning rate of 0.05 is utilized. The input into the MLP is the control signal from the PID controller. The MLP has outputs modelling the pendulum angle and the cart position.

Fig 80: Simulink setup of the online identification

Figure 81 shows the plot of the process pendulum angle and the neural network angle. The blue graph is the inverted pendulum angle and the red is the neural model output. The MLP shows that it is possible to identify the pendulum angle online.

Fig 81: Plot of the pendulum angle (pendulum angle in rad against time)

Figure 82 shows the plot of the process cart position. The blue graph is the cart position and the red is the neural model output.

Fig 82: Plot of the cart position (cart position in m against time)

The real time kernel for the inverted pendulum system contains many different types of controller: PID, nonlinear control law, etc. It is possible to develop and test prototype controllers using the external controller function. The external controller is a file containing the control routine, which is accessed at the interrupt time. The control algorithm (e.g. neural, fuzzy, adaptive) must be written in C code. To develop a correct external controller, the input/output architecture and the limits of the signals must be obeyed.

An external neural controller for the real time inverted pendulum will therefore have to be written in C. The online neural toolbox used previously in the project contains multi-layered perceptrons (MLP) which are all written in C. Instead of writing a neural controller from scratch, an MLP could be adapted to the structure of the external controller format. Before the external controller is developed, an MLP must be trained to control the inverted pendulum. It is possible to train a neural network to imitate the existing PID controller.

Figure 83 shows the training of a neural network. The inputs to the MLP are the pendulum angle and the cart position. The error signal between the output of the MLP and the PID control signal is backpropagated to adjust the neural weights.

Fig 83: Training a MLP to imitate the PID controller

Figure 84 shows the plot of the real control signal and the output from the MLP. The blue signal is the PID control and the red is the neural controller output.

Fig 84: Plot showing the MLP output and PID output (control signal against time)

At this stage, there is a trained MLP which
imitates the PID controller. The weights in the MLP are stored. The next step is developing a neural controller in C and adapting it to the external controller format. Unfortunately there was not enough time in the project to implement the real time neural controller. If a neural controller were implemented in C, the neural weights from the trained network in Fig 83 could be transferred to the new network and set as initial weights. In theory this ANN should be able to control the real time inverted pendulum.

Conclusions

Summary

This research has applied artificial neural networks to the identification and control of the inverted pendulum. Before identification techniques could be tested, a model representing the inverted pendulum was developed in Simulink. Some of the modelling and control techniques involved in the project are linear, so a linearized version of the inverted pendulum was also developed. Open loop identification was initially tested, but the inverted pendulum is open loop unstable, so the pendulum falls before useful data can be collected. One of the requirements for accurate identification is experimental input-output data that shows the dynamics of the system. It was decided that system identification would be performed in closed loop, so stabilizing feedback controllers had to be developed for the linear and nonlinear inverted pendulum. A simple full-state feedback controller stabilized the linear pendulum, and a control law was developed to stabilize the nonlinear pendulum. The closed loop system is stable, so the inverted pendulum can be simulated for longer times and more data can be collected.

Linear identification techniques were applied to the linear pendulum. The aim was to develop a transfer function block that accurately modelled the inverted pendulum. An accurate model will have a low MSE in relation to the process and will show some of the dynamics of the process. The four types of model tested are ARX, ARMAX, Box-Jenkins and Output-Error. It was found that the ARX and ARMAX could not model the
inverted pendulum at all. The Box-Jenkins and Output-Error models had low MSE but did not show any of the dynamics of the inverted pendulum system. The few journal papers on closed loop identification all indicate that the best linear model structures are Box-Jenkins and Output-Error. ARX and ARMAX both assume that the noise spectrum model and the input-output model have the same characteristic dynamics. This explains why the ARX and ARMAX could not model the linear inverted pendulum. One of the reasons why the Box-Jenkins and Output-Error models did not show any of the process dynamics is the closed loop identification itself. Closed loop identification must be used on open-loop unstable systems such as the inverted pendulum, but one of its disadvantages is that feedback controllers mask some of the dynamics of the system.

A detuned controller was then used to control the process. This type of controller keeps the inverted pendulum barely stable, but more of the process dynamics can be seen. The Box-Jenkins and Output-Error identification resumed with much better results. The main conclusions from the linear identification are:

- When trying to model an unstable system, a feedback controller must be used to keep the system stable.
- If the controller is detuned, more of the pendulum dynamics can be seen. This makes the model more accurate.
- The Box-Jenkins and Output-Error models are the only structures that can adequately model the pendulum using the closed loop data.
- The best way to test the quality of a model is to construct a transfer function block of the model and simulate it using a different initial input seed.

To achieve a better approximation of the inverted pendulum, the nonlinear system must be used. The linear identification techniques were applied to the nonlinear pendulum system and were found to be inadequate in modelling the nonlinear nature of the system. The nonlinear nature of neural networks gives them an advantage over linear models in the prediction of
non-linear systems. Before the inverted pendulum system is identified, the process is stabilized using the control law. The control law removes some of the nonlinearities from the process, so a detuned control law was used which allows the process to exhibit more of its dynamics. This improves the quality of the data used in the system identification. Initially single-input single-output networks were developed, the input being the control force and the output the pendulum angle.

The first type of neural network to be developed was the feedforward network. Feedforward networks with a range of hidden layer sizes were tested. The feedforward networks modelled the inverted pendulum well. The MSE between the process and the neural model is low, and the model predicts the dynamics of the pendulum angle. In open-loop identification, increasing the number of hidden layer neurons has a direct influence on the accuracy of the model. In the closed loop case, it was found that using a detuned controller had more of an influence on the model accuracy than increasing the number of hidden layer neurons.

Recurrent 'Elman' networks are the second type of neural network to be developed. Elman networks have built-in feedback loops which, in principle, enable them to model dynamic systems such as the inverted pendulum more accurately than static feedforward networks. Elman networks with different sizes of hidden layer were tested. The results indicate that Elman networks were not as accurate as feedforward networks in approximating the inverted pendulum. It was found that when training the Elman networks to model the inverted pendulum, the training would get stuck at a local minimum. This affected the accuracy of the models developed. The poor results of the Elman networks are attributed to the fact that the training data is from a closed loop system.

The next stage in system identification was to develop a multi-output neural network which models the four outputs of the inverted pendulum. A feedforward network with 100 hidden layer
neurons was used to model the process. The neural network developed could model the pendulum angle and the cart position accurately but completely failed to model the velocity of the cart and the angular velocity of the pendulum.

The main task in the project was to design a controller which keeps the pendulum system inverted. The four main types of neural control (supervised, unsupervised, direct inverse and internal model control) were researched to determine which control technique would be the most efficient to implement. The earliest application of neural networks to the inverted pendulum is by Widrow and Smith [25] and Widrow [26]. They used traditional control methods to derive a control law to stabilize the linearized system. They then trained a neural network to mimic the output of the control law. It was decided that supervised control would be the least complex to implement. It was not possible to develop direct inverse control because this control method requires that the process to be controlled is already open-loop stable. The unsupervised control technique developed by Anderson was too complex for the project time frame.

The first neuro-controller was developed by training a feedforward network to model the control law. Elman networks were also used here to model the control law but were not as accurate. When the training was finished, the neural network was exported into Simulink and placed in the feedback loop instead of the existing control law. The neural network controlled the inverted pendulum similarly to the control law. An experiment was set up which creates a disturbance to the process during the simulation. The neural network lost control of the inverted pendulum because it was unable to adjust its weights to counteract this disturbance. This problem was solved by using the adaptive neural toolbox. This toolbox makes it possible for online neural learning to occur. Two types of neural network were used: Adaline and multi-layered perceptron
(MLP). The ANN was trained offline using the control law. The advantage of using this type of network is that if a disturbance occurs during operation, the error signal is fed back into the Adaline, which adjusts the weights of the network and so counteracts the disturbance. The Adaline adaptive block is designed for approximating 'almost linear' functions. It was found that the Adaline could approximate the control law very accurately.

It was decided to test some of the identification and control techniques on the real time inverted pendulum rig. The real-time inverted pendulum is also open-loop unstable. The real time kernel (RTK) uses standard PID controllers to stabilize the system. Online identification was possible using the adaptive neural toolbox. It was not possible to develop a neural controller for the real time system, but significant progress was made.

Scope for future work

The results from the Elman networks were not as accurate as those from the feedforward networks. The dynamic Elman networks should have been more accurate when modelling a dynamic system such as the inverted pendulum. This could be investigated.

When modelling the inverted pendulum, closed loop identification must be used. One of the faults of closed loop identification is that the controller removes some of the dynamics of the process. More research is needed in developing models from closed loop data.

The neural network controllers developed in the project were all based on the traditional control law. When training an ANN using supervised learning, there must be an existing controller to copy. In order to develop a control law, the dynamics of the process must be known. If it is not possible to develop a control law, or the dynamics of the process are not known, then there is no way to train a neural network by supervised learning. A solution to this problem is developing an unsupervised controller. Unsupervised control does not require an accurate model of the system dynamics or of the system's desired behaviour. The only feedback signal
to the controller is a failure signal when the pendulum falls past a certain angle. The controller must learn through experience by trying various actions. The work done by Anderson [23] in unsupervised control gives practical guidelines for developing such a controller. Future research could therefore address unsupervised control of the inverted pendulum. Supervised control with neural networks has already been studied extensively, whereas unsupervised control is a more difficult but interesting problem.

Bibliography

[1] Friedland, Bernard (1987), Control System Design, New York: McGraw-Hill, pp. 30-52.
[2] Ljung, L. (1987), System Identification - Theory for the User, Prentice Hall.
[3] Nechyba and Xu (1994), "Neural network approach to control system identification with variable activation functions", Robotics Institute, Carnegie-Mellon University.
[4] Guez, A., Selinsky, J., "A trainable neuromorphic controller", Journal of Robotic Systems, Vol. 5, No. 4, pp. 363-388, 1988.
[5] Davalo, Naim, Neural Networks, Macmillan.
[6] Hunt and Sbarbaro, "Neural Networks for Control System - A Survey", Automatica, Vol. 28, 1992, pp. 1083-1112.
[7] Neural Network Toolbox User's Guide, October 1998, The MathWorks Inc.
[8] Pham and Liu, Neural Networks for Identification, Prediction and Control, Springer.
[9] Cybenko, G., "Approximation by superposition of a sigmoidal function", Mathematics of Control, Signals and Systems, Vol. 2, No. 4, pp. 303-314, 1989.
[10] Saerens, M., Soquet, A., "Neural controller based on back-propagation algorithm", IEE Proceedings-F, Vol. 138, No. 1, pp. 55-62, 1991.
[11] Narendra, K.S., Parthasarathy, K., "Identification and control of dynamical systems using neural networks", IEEE Transactions on Neural Networks, Vol. 1, No. 1, pp. 4-27.
[12] Ljung, L., System Identification - Theory for the User, Prentice Hall.
[13] Billings, S.A., "Introduction to nonlinear system analysis and identification", in K. Godfrey and P. Jones, Signal Processing for Control, Springer-Verlag, Berlin.
[14] Ljung, L., System Identification - Theory for the User, Prentice Hall.
[15] Johansson, Rolf, System Modelling and Identification, Prentice Hall.
[16] Snow, W. and Emigholz, K., "Increase model predictive control (MPC) project efficiency by using a modern identification method", ERTC Computing, Paris, France.
[17] Hunt and Sbarbaro, "Neural Networks for Control System - A Survey", Automatica, Vol. 28, 1992, pp. 1083-1112.
[18] Hagan, M. and Demuth, H., Neural Network Design, Boston, PWS, 1996.
[19] Marco, P. and Raul, L., "Application of several neurocontrol schemes to a DOF manipulator".
[20] Magnus Norgaard, Neural Network Design Toolkit, http://www.iau.dtu.dk/research/control/nnlib/manual.pdf
[21] Hunt and Sbarbaro, "Neural Networks for Control System - A Survey", Automatica, Vol. 28, 1992, pp. 1083-1112.
[22] Barto, Sutton and Anderson, "Neuronlike adaptive elements that can solve difficult learning control problems", IEEE Trans. on Systems, Man and Cybernetics, Vol. SMC-13, pp. 834-846, Sept-Oct 1983.
[23] C.W. Anderson, "Learning to control an inverted pendulum using neural networks", IEEE Control Systems Magazine, 9:31-37, 1989.
[24] Campa, Fravolini, Napolitano, "A library of adaptive neural networks for control purposes". The Simulink library can be downloaded from the MathWorks file exchange website in the ANN section, http://www.mathworks.com/matlabcentral/fileexchange/
[25] Widrow, B. and Smith, F., "Pattern-Recognising Control Systems", 1963 Computer and Information Sciences (COINS) Symp. Proc., Washington DC: Spartan, pp. 288-317, 1964.
[26] Widrow, B., "The Original Adaptive Neural Net Broom-Balancer", Int. Symp. Circuits and Systems, Vol. 5, No. 4, pp. 363-388, Aug 1988.
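
As a supplementary illustration of the linearized model (Eq.19 and Eq.20), the following Python sketch integrates those two equations with the rig parameters quoted in the text (M = 1.2 kg, m = 0.11 kg, l = 0.4 m). It is not taken from the project, which uses Simulink: the function name simulate, the forward-Euler integration, the value g = 9.81 and the angle-feedback gains kp and kd are all assumptions made for illustration only, and the simple angle feedback is a stand-in rather than the full-state feedback controller or control law described in the text.

```python
def simulate(theta0, t_end, kp=0.0, kd=0.0, dt=1e-3,
             M=1.2, m=0.11, l=0.4, g=9.81):
    """Integrate the linearized cart-pole model (Eq.19, Eq.20).

    theta0 : initial pendulum angle in rad (cart starts at rest at y = 0)
    kp, kd : hypothetical gains of an angle feedback f = kp*theta + kd*theta_dot;
             with kp = kd = 0 the system runs open loop (f = 0)
    Returns the pendulum angle in rad at time t_end.
    """
    y, y_dot = 0.0, 0.0
    theta, theta_dot = theta0, 0.0
    for _ in range(int(round(t_end / dt))):
        # Control force on the cart (assumed feedback, not the thesis controller)
        f = kp * theta + kd * theta_dot
        # Eq.19: cart acceleration
        y_ddot = f / M - (m * g / M) * theta
        # Eq.20: pendulum angular acceleration
        theta_ddot = -f / (M * l) + ((M + m) * g / (M * l)) * theta
        # Forward-Euler step: positions first (old velocities), then velocities
        y += y_dot * dt
        theta += theta_dot * dt
        y_dot += y_ddot * dt
        theta_dot += theta_ddot * dt
    return theta


if __name__ == "__main__":
    print("open loop angle after 2 s:  ", simulate(0.01, 2.0))
    print("with angle feedback, 2 s:   ", simulate(0.01, 2.0, kp=40.0, kd=5.0))
```

With zero gains the angle grows away from the upright position, which reflects the open-loop instability described in the Introduction. With the hypothetical gains the angle decays towards zero, although the cart position is left uncontrolled, and the linearized model is only meaningful while the angle stays small.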

Date posted: 24/09/2016, 17:26
