



USING GENETIC ALGORITHMS TO OPTIMIZE CONTROL LYAPUNOV FUNCTIONS

By BRIAN K. HARGRAVE
Bachelor of Science in Mechanical Engineering
Oklahoma State University, Stillwater, OK, 1999

Submitted to the Faculty of the Graduate College of the Oklahoma State University in partial fulfillment of the requirements for the Degree of MASTER OF SCIENCE, July, 2008

Thesis Approved:
Dr. L. L. Hoberock, Thesis Adviser
Dr. P. R. Pagilla
Dr. G. E. Young
Dr. A. Gordon Emslie, Dean of the Graduate College

Table of Contents

1 INTRODUCTION ..... 1
1.1 MOTIVATION ..... 1
1.2 PROBLEM STATEMENT AND BACKGROUND THEORY ..... 7
1.3 THESIS OUTLINE ..... 15
2 A GENETIC ALGORITHM FOR CLF OPTIMIZATION ..... 17
2.1 GENETIC ALGORITHM MOTIVATION ..... 17
2.2 TAILORING TO THE SPECIFIC PROBLEM ..... 18
2.3 GENETIC ALGORITHM DESCRIPTION ..... 21
2.4 STOCHASTIC ESTIMATION OF γ_min ..... 27
3 FULL STATE-FEEDBACK CLF CONTROL ..... 32
3.1 SELECTION OF BENCHMARK EXAMPLES ..... 32
3.2 EXAMPLE 1: ARTIFICIAL SYSTEM ..... 35
3.3 EXAMPLE 2: DOUBLE INVERTED PENDULUM SYSTEM ..... 78
3.4 EXAMPLE 3: CART-AND-POLE SYSTEM ..... 106
3.5 DISCUSSION OF RESULTS ..... 132
4 FUTURE DIRECTIONS AND CONCLUSIONS ..... 135
4.1 GENERALIZED CONTROL LYAPUNOV FUNCTIONS ..... 135
4.2 IMPROVING THE CRITICAL POINT COMPUTATION ..... 137
4.3 OUTPUT-FEEDBACK, ADAPTIVE, AND ROBUST CONTROL ..... 140
4.4 DISCRETE-TIME CONTROL ..... 140
4.5 PUTTING IT ALL TOGETHER FOR PRACTICAL CONTROL DESIGN ..... 141
4.6 CONCLUSION ..... 153
REFERENCES ..... 155
APPENDIX: MATLAB CODE LISTING AND HOW TO USE IT ..... 157

List of Tables

TABLE 3-1: LLARC PARAMETERS FOR SYSTEM (3.3) (NGAMSOM, 2001) ..... 36
TABLE 3-2: PERFORMANCE OF LLARC ON SYSTEM (3.3) ..... 36
TABLE 3-3: PERFORMANCE OF NLARC ON SYSTEM (3.3) ..... 36
TABLE 3-4: PERFORMANCE OF LLARC ON LINEARIZED SYSTEM (3.4) ..... 37
TABLE 3-5: GA PARAMETERS TO OPTIMIZE LGAC FOR SYSTEM (3.3) ..... 44
TABLE 3-6: OPTIMIZED PARAMETERS OF LGAC FOR SYSTEM (3.3) ..... 45
TABLE 3-7: PERFORMANCE OF LGAC ON SYSTEM (3.3) ..... 46
TABLE 3-8: PERFORMANCE OF LGAC ON LINEARIZED SYSTEM (3.4) ..... 46
TABLE 3-9: GA PARAMETERS TO OPTIMIZE NGAC FOR SYSTEM (3.3) ..... 56
TABLE 3-10: OPTIMIZED PARAMETERS OF NGAC FOR SYSTEM (3.3) ..... 57
TABLE 3-11: PERFORMANCE OF NGAC ON SYSTEM (3.3) ..... 57
TABLE 3-12: PERFORMANCE OF NGAC ON LINEARIZED SYSTEM (3.4) ..... 58
TABLE 3-13: GA PARAMETERS TO OPTIMIZE NGAC FOR SYSTEM (3.3) (HIGH DENSITY X_d^0) ..... 67
TABLE 3-14: OPTIMIZED PARAMETERS OF NGAC FOR SYSTEM (3.3) (HIGH DENSITY X_d^0) ..... 68
TABLE 3-15: PERFORMANCE OF NGAC ON SYSTEM (3.3) (HIGH DENSITY X_d^0) ..... 69
TABLE 3-16: PERFORMANCE OF NGAC ON LINEARIZED SYSTEM (3.4) (HIGH DENSITY X_d^0) ..... 69
TABLE 3-17: LLARC PARAMETERS FOR SYSTEM (3.5) (NGAMSOM, 2001) ..... 79
TABLE 3-18: PERFORMANCE OF LLARC ON SYSTEM (3.5) ..... 80
TABLE 3-19: PERFORMANCE OF NLARC ON SYSTEM (3.5) ..... 80
TABLE 3-20: PERFORMANCE OF LLARC ON LINEARIZED SYSTEM (3.6) ..... 81
TABLE 3-21: GA PARAMETERS TO OPTIMIZE LGAC FOR SYSTEM (3.5) (SMALL X_d^0 RANGE) ..... 84
TABLE 3-22: OPTIMIZED PARAMETERS OF LGAC FOR SYSTEM (3.5) (SMALL X_d^0 RANGE) ..... 85
TABLE 3-23: PERFORMANCE OF LGAC ON LINEARIZED SYSTEM (3.6) (SMALL X_d^0 RANGE) ..... 86
TABLE 3-24: GA PARAMETERS TO OPTIMIZE LGAC FOR SYSTEM (3.5) (LARGE X_d^0 RANGE) ..... 90
TABLE 3-25: OPTIMIZED PARAMETERS OF LGAC FOR SYSTEM (3.5) (LARGE X_d^0 RANGE) ..... 90
TABLE 3-26: PERFORMANCE OF LGAC ON SYSTEM (3.5) (LARGE X_d^0 RANGE) ..... 91
TABLE 3-27: PERFORMANCE OF LGAC ON LINEARIZED SYSTEM (3.6) (LARGE X_d^0 RANGE) ..... 91
TABLE 3-28: GA PARAMETERS TO OPTIMIZE NGAC FOR SYSTEM (3.5) (SMALL X_d^0 RANGE) ..... 96
TABLE 3-29: OPTIMIZED PARAMETERS OF NGAC FOR SYSTEM (3.5) (SMALL X_d^0 RANGE) ..... 96
TABLE 3-30: PERFORMANCE OF NGAC ON SYSTEM (3.5) (SMALL X_d^0 RANGE) ..... 97
TABLE 3-31: PERFORMANCE OF NGAC ON LINEARIZED SYSTEM (3.6) (SMALL X_d^0 RANGE) ..... 97
TABLE 3-32: GA PARAMETERS TO OPTIMIZE NGAC FOR SYSTEM (3.5) (LARGE X_d^0 RANGE) ..... 101
TABLE 3-33: OPTIMIZED PARAMETERS OF NGAC FOR SYSTEM (3.5) (LARGE X_d^0 RANGE) ..... 101
TABLE 3-34: PERFORMANCE OF NGAC ON SYSTEM (3.5) (LARGE X_d^0 RANGE) ..... 102
TABLE 3-35: PERFORMANCE OF NGAC ON SYSTEM (3.6) (LARGE X_d^0 RANGE) ..... 102
TABLE 3-36: LLARC PARAMETERS FOR SYSTEM (3.7) (NGAMSOM, 2001) ..... 108
TABLE 3-37: PERFORMANCE OF LLARC ON SYSTEM (3.7) ..... 108
TABLE 3-38: PERFORMANCE OF NLARC ON SYSTEM (3.7) ..... 108
TABLE 3-39: PERFORMANCE OF LLARC ON LINEARIZED SYSTEM (3.8) ..... 109
TABLE 3-40: GA PARAMETERS TO OPTIMIZE LGAC FOR SYSTEM (3.7) (SMALL X_d^0 RANGE) ..... 111
TABLE 3-41: OPTIMIZED PARAMETERS OF LGAC FOR SYSTEM (3.7) (SMALL X_d^0 RANGE) ..... 112
TABLE 3-42: PERFORMANCE OF LGAC ON SYSTEM (3.7) (SMALL X_d^0 RANGE) ..... 112
TABLE 3-43: PERFORMANCE OF LGAC ON LINEARIZED SYSTEM (3.8) (SMALL X_d^0 RANGE) ..... 113
TABLE 3-44: GA PARAMETERS TO OPTIMIZE LGAC FOR SYSTEM (3.7) (LARGE X_d^0 RANGE) ..... 116
TABLE 3-45: OPTIMIZED PARAMETERS OF LGAC FOR SYSTEM (3.7) (LARGE X_d^0 RANGE) ..... 117
TABLE 3-46: PERFORMANCE OF LGAC ON SYSTEM (3.7) (LARGE X_d^0 RANGE) ..... 117
TABLE 3-47: PERFORMANCE OF LGAC ON LINEARIZED SYSTEM (3.8) (LARGE X_d^0 RANGE) ..... 118
TABLE 3-48: GA PARAMETERS TO OPTIMIZE NGAC FOR SYSTEM (3.7) (SMALL X_d^0 RANGE) ..... 121
TABLE 3-49: OPTIMIZED PARAMETERS OF NGAC FOR SYSTEM (3.7) (SMALL X_d^0 RANGE) ..... 122
TABLE 3-50: PERFORMANCE OF NGAC ON SYSTEM (3.7) (SMALL X_d^0 RANGE) ..... 122
TABLE 3-51: PERFORMANCE OF NGAC ON LINEARIZED SYSTEM (3.8) (SMALL X_d^0 RANGE) ..... 123
TABLE 3-52: GA PARAMETERS TO OPTIMIZE NGAC FOR SYSTEM (3.7) (LARGE X_d^0 RANGE) ..... 127
TABLE 3-53: OPTIMIZED PARAMETERS OF NGAC FOR SYSTEM (3.7) (LARGE X_d^0 RANGE) ..... 127
TABLE 3-54: PERFORMANCE OF NGAC ON SYSTEM (3.7) (LARGE X_d^0 RANGE) ..... 128
TABLE 3-55: PERFORMANCE OF NGAC ON LINEARIZED SYSTEM (3.8) (LARGE X_d^0 RANGE) ..... 128

List of Figures

FIGURE 2-1: FLOW OF FITNESS COMPUTATION ..... 23
FIGURE 2-2: GA OPTIMIZATION ALGORITHM ..... 31
FIGURE 3-1: LEVEL SETS OF V FOR LLARC ON SYSTEM (3.3) ..... 37
FIGURE 3-2: "GAMMA" (γ) FOR LLARC ON SYSTEM (3.3) ..... 38
FIGURE 3-3: STABILITY TABLE FROM NGAMSOM (2001) ..... 39
FIGURE 3-4: STATES OF SYSTEM (3.3) USING LLARC ..... 40
FIGURE 3-5: CONTROL SIGNAL OF LLARC ON SYSTEM (3.3) ..... 40
FIGURE 3-6: LEVEL SETS OF V FOR NLARC ON SYSTEM (3.3) ..... 42
FIGURE 3-7: "GAMMA" (γ) FOR NLARC ON SYSTEM (3.3) ..... 42
FIGURE 3-8: STATES OF SYSTEM (3.3) USING NLARC ..... 43
FIGURE 3-9: CONTROL SIGNAL OF NLARC ON SYSTEM (3.3) ..... 43
FIGURE 3-10: LEVEL SETS OF V FOR LGAC ON SYSTEM (3.3) (W=[1 0]) ..... 47
FIGURE 3-11: "GAMMA" (γ) FOR LGAC ON SYSTEM (3.3) (W=[1 0]) ..... 48
FIGURE 3-12: CRITICAL POINTS FOR LGAC POPULATION USING SYSTEM (3.3) (W=[1 0]) ..... 48
FIGURE 3-13: STATES OF SYSTEM (3.3) USING LGAC (W=[1 0]) ..... 50
FIGURE 3-14: CONTROL SIGNAL OF LGAC ON SYSTEM (3.3) (W=[1 0]) ..... 50
FIGURE 3-15: LEVEL SETS OF V FOR LGAC ON SYSTEM (3.3) (W=[1 1]) ..... 51
FIGURE 3-16: "GAMMA" (γ) FOR LGAC ON SYSTEM (3.3) (W=[1 1]) ..... 51
FIGURE 3-17: CRITICAL POINTS FOR LGAC POPULATION USING SYSTEM (3.3) (W=[1 1]) ..... 52
FIGURE 3-18: STATES OF SYSTEM (3.3) USING LGAC (W=[1 1]) ..... 52
FIGURE 3-19: CONTROL SIGNAL PROFILE OF LGAC ON SYSTEM (3.3) (W=[1 1]) ..... 53
FIGURE 3-20: LEVEL SETS OF V FOR LGAC ON SYSTEM (3.3) (W=[0 1]) ..... 53
FIGURE 3-21: "GAMMA" (γ) FOR LGAC ON SYSTEM (3.3) (W=[0 1]) ..... 54
FIGURE 3-22: CRITICAL POINTS FOR LGAC POPULATION USING SYSTEM (3.3) (W=[0 1]) ..... 54
FIGURE 3-23: STATES OF SYSTEM (3.3) USING LGAC (W=[0 1]) ..... 55
FIGURE 3-24: CONTROL SIGNAL OF LGAC ON SYSTEM (3.3) (W=[0 1]) ..... 55
FIGURE 3-25: LEVEL SETS OF V FOR NGAC ON SYSTEM (3.3) (W=[1 0]) ..... 59
FIGURE 3-26: "GAMMA" (γ) FOR NGAC ON SYSTEM (3.3) (W=[1 0]) ..... 59
FIGURE 3-27: CRITICAL POINTS FOR NGAC POPULATION USING SYSTEM (3.3) (W=[1 0]) ..... 60
FIGURE 3-28: STATES OF SYSTEM (3.3) USING NGAC (W=[1 0]) ..... 60
FIGURE 3-29: CONTROL SIGNAL OF NGAC ON SYSTEM (3.3) (W=[1 0]) ..... 61
FIGURE 3-30: LEVEL SETS OF V FOR NGAC ON SYSTEM (3.3) (W=[1 1]) ..... 62
FIGURE 3-31: "GAMMA" (γ) FOR NGAC ON SYSTEM (3.3) (W=[1 1]) ..... 62
FIGURE 3-32: CRITICAL POINTS FOR NGAC POPULATION USING SYSTEM (3.3) (W=[1 1]) ..... 63
FIGURE 3-33: STATES OF SYSTEM (3.3) USING NGAC (W=[1 1]) ..... 63
FIGURE 3-34: CONTROL SIGNAL OF NGAC ON SYSTEM (3.3) (W=[1 1]) ..... 64
FIGURE 3-35: LEVEL SETS OF V FOR NGAC ON SYSTEM (3.3) (W=[0 1]) ..... 65
FIGURE 3-36: "GAMMA" (γ) FOR NGAC ON ARTIFICIAL SYSTEM (3.3) (W=[0 1]) ..... 65
FIGURE 3-37: CRITICAL POINTS FOR NGAC POPULATION USING SYSTEM (3.3) (W=[0 1]) ..... 66
FIGURE 3-38: STATES OF SYSTEM (3.3) USING NGAC (W=[0 1]) ..... 66
FIGURE 3-39: CONTROL SIGNAL OF NGAC ON SYSTEM (3.3) (W=[0 1]) ..... 67
FIGURE 3-40: LEVEL SETS OF V FOR NGAC ON SYSTEM (3.3) (W=[1 0], HIGH DENSITY X_d^0) ..... 70
FIGURE 3-41: "GAMMA" (γ) FOR NGAC ON SYSTEM (3.3) (W=[1 0], HIGH DENSITY X_d^0) ..... 70
FIGURE 3-42: CRITICAL POINTS FOR NGAC USING SYSTEM (3.3) (W=[1 0], HIGH DENSITY X_d^0) ..... 71
FIGURE 3-43: STATES OF SYSTEM (3.3) USING NGAC (W=[1 0], HIGH DENSITY X_d^0) ..... 71
FIGURE 3-44: CONTROL SIGNAL OF NGAC ON SYSTEM (3.3) (W=[1 0], HIGH DENSITY X_d^0) ..... 72
FIGURE 3-45: LEVEL SETS OF V FOR NGAC ON SYSTEM (3.3) (W=[1 1], HIGH DENSITY X_d^0) ..... 72
FIGURE 3-46: "GAMMA" (γ) FOR NGAC ON SYSTEM (3.3) (W=[1 1], HIGH DENSITY X_d^0) ..... 73
FIGURE 3-47: CRITICAL POINTS FOR NGAC USING SYSTEM (3.3) (W=[1 1], HIGH DENSITY X_d^0) ..... 73
FIGURE 3-48: STATES OF ARTIFICIAL SYSTEM (3.3) USING NGAC (W=[1 1], HIGH DENSITY X_d^0) ..... 74
FIGURE 3-49: CONTROL SIGNAL OF NGAC ON SYSTEM (3.3) (W=[1 1], HIGH DENSITY X_d^0) ..... 74
FIGURE 3-50: LEVEL SETS OF V FOR NGAC ON SYSTEM (3.3) (W=[0 1], HIGH DENSITY X_d^0) ..... 75
FIGURE 3-51: "GAMMA" (γ) FOR NGAC ON SYSTEM (3.3) (W=[0 1], HIGH DENSITY X_d^0) ..... 75
FIGURE 3-52: CRITICAL POINTS FOR NGAC USING SYSTEM (3.3) (W=[0 1], HIGH DENSITY X_d^0) ..... 76
FIGURE 3-53: STATES OF SYSTEM (3.3) USING NGAC (W=[0 1], HIGH DENSITY X_d^0) ..... 76
FIGURE 3-54: CONTROL SIGNAL OF NGAC ON SYSTEM (3.3) (W=[0 1], HIGH DENSITY X_d^0) ..... 77
FIGURE 3-55: DOUBLE INVERTED PENDULUM SYSTEM (MISAWA ET AL, 1995) ..... 78
FIGURE 3-56: STATES OF SYSTEM (3.5) USING LLARC ..... 82
FIGURE 3-57: CONTROL SIGNAL OF SYSTEM (3.5) USING LLARC ..... 82
FIGURE 3-58: STATES OF SYSTEM (3.5) USING NLARC ..... 83
FIGURE 3-59: CONTROL SIGNAL OF SYSTEM (3.5) USING NLARC ..... 83
FIGURE 3-60: STATES OF SYSTEM (3.5) USING LGAC (W=[1 0], SMALL X_d^0 RANGE) ..... 87
FIGURE 3-61: CONTROL SIGNAL OF SYSTEM (3.5) USING LGAC (W=[1 0], SMALL X_d^0 RANGE) ..... 87
FIGURE 3-62: STATES OF SYSTEM (3.5) USING LGAC (W=[1 1], SMALL X_d^0 RANGE) ..... 88
FIGURE 3-63: CONTROL SIGNAL OF SYSTEM (3.5) USING LGAC (W=[1 1], SMALL X_d^0 RANGE) ..... 88
FIGURE 3-64: STATES OF SYSTEM (3.5) USING LGAC (W=[0 1], SMALL X_d^0 RANGE) ..... 89
FIGURE 3-65: CONTROL SIGNAL OF SYSTEM (3.5) USING LGAC (W=[0 1], SMALL X_d^0 RANGE) ..... 89
FIGURE 3-66: STATES OF SYSTEM (3.5) USING LGAC (W=[1 0], LARGE X_d^0 RANGE) ..... 93
FIGURE 3-67: CONTROL SIGNAL OF SYSTEM (3.5) USING LGAC (W=[1 0], LARGE X_d^0 RANGE) ..... 93
FIGURE 3-68: STATES OF SYSTEM (3.5) USING LGAC (W=[1 1], LARGE X_d^0 RANGE) ..... 94
FIGURE 3-69: CONTROL SIGNAL OF SYSTEM (3.5) USING LGAC (W=[1 1], LARGE X_d^0 RANGE) ..... 94
FIGURE 3-70: STATES OF SYSTEM (3.5) USING LGAC (W=[0 1], LARGE X_d^0 RANGE) ..... 95
FIGURE 3-71: CONTROL SIGNAL OF SYSTEM (3.5) USING LGAC (W=[0 1], LARGE X_d^0 RANGE) ..... 95
FIGURE 3-72: STATES OF SYSTEM (3.5) USING NGAC (W=[1 0], SMALL X_d^0 RANGE) ..... 98
FIGURE 3-73: CONTROL SIGNAL OF SYSTEM (3.5) USING NGAC (W=[1 0], SMALL X_d^0 RANGE) ..... 98
FIGURE 3-74: STATES OF SYSTEM (3.5) USING NGAC (W=[1 1], SMALL X_d^0 RANGE) ..... 99
FIGURE 3-75: CONTROL SIGNAL OF SYSTEM (3.5) USING NGAC (W=[1 1], SMALL X_d^0 RANGE) ..... 99
FIGURE 3-76: STATES OF SYSTEM (3.5) USING NGAC (W=[0 1], SMALL X_d^0 RANGE) ..... 100
FIGURE 3-77: CONTROL SIGNAL OF SYSTEM (3.5) USING NGAC (W=[0 1], SMALL X_d^0 RANGE) ..... 100
FIGURE 3-78: STATES OF SYSTEM (3.5) USING NGAC (W=[1 0], LARGE X_d^0 RANGE) ..... 103
FIGURE 3-79: CONTROL SIGNAL OF SYSTEM (3.5) USING NGAC (W=[1 0], LARGE X_d^0 RANGE) ..... 103
FIGURE 3-80: STATES OF SYSTEM (3.5) USING NGAC (W=[1 1], LARGE X_d^0 RANGE) ..... 104
FIGURE 3-81: CONTROL SIGNAL OF SYSTEM (3.5) USING NGAC (W=[1 1], LARGE X_d^0 RANGE) ..... 104
FIGURE 3-82: STATES OF SYSTEM (3.5) USING NGAC (W=[0 1], LARGE X_d^0 RANGE) ..... 105
FIGURE 3-83: CONTROL SIGNAL OF SYSTEM (3.5) USING NGAC (W=[0 1], LARGE X_d^0 RANGE) ..... 105
FIGURE 3-84: STATES OF SYSTEM (3.7) USING LLARC ..... 109
FIGURE 3-85: CONTROL SIGNAL OF SYSTEM (3.7) USING LLARC ..... 110
FIGURE 3-86: STATES OF SYSTEM (3.7) USING NLARC ..... 110
FIGURE 3-87: CONTROL SIGNAL OF SYSTEM (3.7) USING NLARC ..... 111
FIGURE 3-88: STATES OF SYSTEM (3.7) USING LGAC (W=[1 0], SMALL X_d^0 RANGE) ..... 113
FIGURE 3-89: CONTROL SIGNAL OF SYSTEM (3.7) USING LGAC (W=[1 0], SMALL X_d^0 RANGE) ..... 114
FIGURE 3-90: STATES OF SYSTEM (3.7) USING LGAC (W=[1 1], SMALL X_d^0 RANGE) ..... 114
FIGURE 3-91: CONTROL SIGNAL OF SYSTEM (3.7) USING LGAC (W=[1 1], SMALL X_d^0 RANGE) ..... 115
FIGURE 3-92: STATES OF SYSTEM (3.7) USING LGAC (W=[0 1], SMALL X_d^0 RANGE) ..... 115
FIGURE 3-93: CONTROL SIGNAL OF SYSTEM (3.7) USING LGAC (W=[0 1], SMALL X_d^0 RANGE) ..... 116
FIGURE 3-94: STATES OF SYSTEM (3.7) USING LGAC (W=[1 0], LARGE X_d^0 RANGE) ..... 118
FIGURE 3-95: CONTROL SIGNAL OF SYSTEM (3.7) USING LGAC (W=[1 0], LARGE X_d^0 RANGE) ..... 119
FIGURE 3-96: STATES OF SYSTEM (3.7) USING LGAC (W=[1 1], LARGE X_d^0 RANGE) ..... 119
FIGURE 3-97: CONTROL SIGNAL OF SYSTEM (3.7) USING LGAC (W=[1 1], LARGE X_d^0 RANGE) ..... 120
FIGURE 3-98: STATES OF SYSTEM (3.7) USING LGAC (W=[0 1], LARGE X_d^0 RANGE) ..... 120
FIGURE 3-99: CONTROL SIGNAL OF SYSTEM (3.7) USING LGAC (W=[0 1], LARGE X_d^0 RANGE) ..... 121
FIGURE 3-100: STATES OF SYSTEM (3.7) USING NGAC (W=[1 0], SMALL X_d^0 RANGE) ..... 124
FIGURE 3-101: CONTROL SIGNAL OF SYSTEM (3.7) USING NGAC (W=[1 0], SMALL X_d^0 RANGE) ..... 124
FIGURE 3-102: STATES OF SYSTEM (3.7) USING NGAC (W=[1 1], SMALL X_d^0 RANGE) ..... 125
FIGURE 3-103: CONTROL SIGNAL OF SYSTEM (3.7) USING NGAC (W=[1 1], SMALL X_d^0 RANGE) ..... 125
FIGURE 3-104: STATES OF SYSTEM (3.7) USING NGAC (W=[0 1], SMALL X_d^0 RANGE) ..... 126
FIGURE 3-105: CONTROL SIGNAL OF SYSTEM (3.7) USING NGAC (W=[0 1], SMALL X_d^0 RANGE) ..... 126
FIGURE 3-106: STATES OF SYSTEM (3.7) USING NGAC (W=[1 0], LARGE X_d^0 RANGE) ..... 129
FIGURE 3-107: CONTROL SIGNAL OF SYSTEM (3.7) USING NGAC (W=[1 0], LARGE X_d^0 RANGE) ..... 129
FIGURE 3-108: STATES OF SYSTEM (3.7) USING NGAC (W=[1 1], LARGE X_d^0 RANGE) ..... 130
FIGURE 3-109: CONTROL SIGNAL OF SYSTEM (3.7) USING NGAC (W=[1 1], LARGE X_d^0 RANGE) ..... 130
FIGURE 3-110: STATES OF SYSTEM (3.7) USING NGAC (W=[0 1], LARGE X_d^0 RANGE) ..... 131
FIGURE 3-111: CONTROL SIGNAL OF SYSTEM (3.7) USING NGAC (W=[0 1], LARGE X_d^0 RANGE) ..... 131

List of Symbols

x: system state
f, g: nonlinear system dynamics
A, B: linearized system dynamics
u: system control
ℜ^n: n-dimensional vector space over the real line
C^1: space of all real-valued differentiable functions
V: Lyapunov function
P: Lyapunov function parameters
K: controller parameters
Q: quadratic performance index weighting matrix
inf: "infimum" of set theory
sup: "supremum" of set theory
γ_min: minimum rate of convergence
X_0: region of interest
X: region of attraction
X_d^0: checking set
u_max: maximum control effort
‖·‖: 2-norm
ε: distance function in the region of interest
L_f: system Lipschitz constant
E: optimization parameters
λ̲: minimum eigenvalue
λ̄: maximum eigenvalue
p_r: probability of copy
p_c: probability of crossover
p_m: probability of mutation
s: fitness bias parameter
μ_γ: mean rate of convergence
σ_γ: standard deviation of the rate of convergence
μ_u: mean control effort
σ_u: standard deviation of control effort

1 Introduction

1.1 Motivation

1.1.1 CLF Design Theory

Linear H2 and H∞ design methods are proven tools for designing control laws for guaranteed optimal and
robust performance of linear systems. Zhou (1997) and Levine et al. (1996) provide introductions to these subjects. The linear structure admits controller design methods applicable to all linear systems that are controllable and observable. Although nonlinear systems do not lend themselves to a single design method, many engineering systems are linearizable and can be fit into the linear H2 and H∞ design framework. The drawback is that the resulting controllers exhibit degraded performance, because they must be made robust against the neglected nonlinearities.

A more general and less conservative controller design method for nonlinear systems is the use of control Lyapunov functions (CLFs) (Khalil, 1996), (Krstic, 1995). This approach forces the "energy" of the system to decrease with time. "Energy" is represented by a positive definite function of the system states, called the CLF. As the states increase in magnitude, the CLF increases. As an example, in a mechanical system, energy increases with increasing positions and velocities, such that an appropriate CLF also increases with the positions and velocities. The CLF time derivative can then be made a function of the control input(s), such that any control law that causes the CLF to constantly decrease over time (in a closed set of state space) is guaranteed to drive the states to the equilibrium point(s). If the energy of the system continuously decreases, the system must settle to an equilibrium point.

Lyapunov stability theory can be used to find a satisfactory control law via two broad approaches (Kokotovic & Arcak, 2001):

1. A control law is selected, and a search for a Lyapunov function is conducted to prove the performance of the closed-loop system.
2. A control Lyapunov function is selected, from which a control law is derived that yields certain performance measures.

The difference between the two approaches is that the first relies primarily upon analysis, while the second relies on synthesis.
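The energy argument above can be illustrated with a minimal simulation. The thesis's code is MATLAB (see the Appendix); the sketch below uses Python and an assumed toy scalar system that is not one of the thesis benchmarks: for the system ẋ = x³ + u with CLF candidate V = ½x², the control u = −x³ − x gives V̇ = −x², so the "energy" V decays and the state settles to the origin.

```python
import numpy as np

# Toy illustration (not from the thesis): scalar system x' = x^3 + u.
# CLF candidate V(x) = 0.5*x^2.  Any u making V' = x*(x^3 + u) negative
# away from x = 0 drives the "energy" V, and hence the state, to zero.
def u(x):
    return -x**3 - x          # yields V' = -x^2 <= 0


def simulate(x0, dt=1e-3, steps=5000):
    """Forward-Euler rollout; returns the final state and the V history."""
    x, V_hist = x0, []
    for _ in range(steps):
        V_hist.append(0.5 * x**2)
        x += dt * (x**3 + u(x))   # closed-loop dynamics reduce to x' = -x
    return x, V_hist


x_final, V_hist = simulate(x0=1.0)
print(x_final)                 # close to the origin
print(V_hist[0] > V_hist[-1])  # the CLF decreased along the trajectory
```

Under this control the closed loop is simply ẋ = −x, so the decay of V is exponential; the point is only that monotone decrease of the CLF certifies settling.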
Specifically, in the first approach the control law is fixed, such that the performance measures are invariant to the selection of the Lyapunov function. Hence the Lyapunov function acts merely as a tool to guarantee stability, and sometimes to estimate the performance measures. In the second approach, the control law arises from guaranteed performance measures of the CLF (i.e., rate of convergence and region of attraction). The problem is that the CLF may not be a good selection for determining the desired performance measure, because the CLF topology and parameter values may be incapable of yielding "globally optimal" performance, where "global" means over the set of all possible CLFs.

The work of Johansen (2000a, b) offers a computational procedure for generating non-quadratic Lyapunov functions which can be used to estimate the performance of smooth nonlinear systems. The work shows that any Lyapunov function may be represented to arbitrary accuracy by a sufficiently large finite summation of quadratic functions weighted by smooth switching functions. Johansen's work indirectly supports the need to appropriately tune a CLF so that an accurate estimate of the performance of the system may be obtained and used by the optimization algorithm.

1.1.2 CLF Design Examples

The manner in which the system states and control signal settle to equilibrium is the focus herein, and will be referred to as the performance measure of the control law and CLF pair. Typical performance measures include the RMS and maximum values of the system states and control signals, the rate of convergence of the system, and the maximum region of attraction, loosely defined as the set of points in state space from which the state returns to equilibrium (defined precisely below). Because of the dependence of the CLF on the control signals, the performance measures depend on both the selection of the CLF and the control law.
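The trajectory-based performance measures named above (RMS and maximum values of the states and control signal) are straightforward to compute from sampled histories. A minimal sketch in Python, using a hypothetical decaying trajectory purely for illustration:

```python
import numpy as np

# Sketch: computing the trajectory-based performance measures named above
# (RMS and maximum values of states and control) from sampled histories.
# x_hist: (N, n) array of state samples; u_hist: (N,) array of control samples.
def performance_measures(x_hist, u_hist):
    return {
        "x_rms": float(np.sqrt(np.mean(np.sum(x_hist**2, axis=1)))),
        "x_max": float(np.max(np.linalg.norm(x_hist, axis=1))),
        "u_rms": float(np.sqrt(np.mean(u_hist**2))),
        "u_max": float(np.max(np.abs(u_hist))),
    }


# Hypothetical exponentially decaying trajectory, for demonstration only.
t = np.linspace(0.0, 5.0, 501)
x_hist = np.stack([np.exp(-t), -np.exp(-t)], axis=1)
u_hist = -2.0 * np.exp(-t)
print(performance_measures(x_hist, u_hist))
```

For the sample data the peaks occur at t = 0, so `u_max` is 2 and `x_max` is √2; the RMS values depend on the sampling horizon.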
In the work that follows, we first select the CLF and control-law functions, and then use a numerical method to tune the parameters of these functions to satisfy the performance requirements. We investigate the simultaneous numerical tuning of both the CLF and the control-law parameters. By allowing both entities to vary during the search for a solution, shortcomings of other methods may be overcome. Henceforth, the resulting CLF and control law will correspond to the pair (P, K), where P stands for the vector of parameters of the CLF and K for the vector of parameters of the control law. In some situations, P and K may not be independent of each other, due to the selection of the CLF and controller topologies (e.g. $K = -\tfrac{1}{2}B^T P$, $B \in \Re^{n\times 1}$, $P \in \Re^{n\times n}$).

To demonstrate the utility of this approach, consider as an example a parameterized 2nd-order linear time-invariant system given by the equation

$$\dot{x} = \begin{bmatrix} 0 & 1 \\ -1 & -a \end{bmatrix} x + \begin{bmatrix} 0 \\ 1 \end{bmatrix} u, \quad a > 0, \qquad x = [x_1\ x_2]^T \in \Re^2,\ u \in \Re \tag{1.1}$$

where dot notation indicates time differentiation. We wish to use CLF theory to design a stabilizing control, u, of the states, x. We select a parameterized form of the CLF, V(x), given by

$$V(x,b) = x^T \begin{bmatrix} b & 0 \\ 0 & 1-b \end{bmatrix} x, \quad 0 < b < 1 \tag{1.2}$$

We use this form of CLF so that we may vary the influence of $x_1$ and $x_2$ on the CLF. Suppose we seek to find a control such that

$$\dot{V}(x) = -x_2^2 \tag{1.3}$$

Computing $\dot{V}$ yields

$$\dot{V}(x,a,b) = 2(2b-1)x_1 x_2 + 2a(b-1)x_2^2 + 2(1-b)x_2 u \tag{1.4}$$

It is straightforward to show that (1.3) can be satisfied by selecting u as

$$u(x,a,b) = -\frac{2b-1}{1-b}\,x_1 + \left(a - \frac{1}{2(1-b)}\right) x_2 \tag{1.5}$$

The pair (P, K) becomes (b, [a b]). Now suppose we desire to minimize the control effort inside the region defined by $S = \{x \in \Re^2 : x_1 \in [-1,1],\ x_2 \in [-1,1]\}$ (a unit square centered at the origin in state-space) by varying the free parameter b.
Considering $a$ fixed, a suitable performance measure could be defined as

$$J(b) = \int_{-1}^{1}\int_{-1}^{1} u(x,a,b)^2 \, dx_1\, dx_2 \tag{1.6}$$

in which we wish to minimize the control effort in S. It may be shown that, for the stated ranges of a and b, the minimum of (1.6) exists at the point

$$b^* = \left\{ b \in \Re : \frac{dJ}{db} = 0 \right\} \tag{1.7}$$

The derivative of J with respect to b is

$$\frac{dJ}{db} = -\frac{2\big(2a(b-1)+1\big)}{3(b-1)^3} \tag{1.8}$$

In this special case, solving for $b^*$ in (1.7) yields

$$b^* = 1 - \frac{1}{2a} \tag{1.9}$$

We see that the selection of the CLF in (1.2) has an effect on the optimality of the resulting control. Our example minimized the control effort in a region of state space. Had we selected an arbitrary value for b other than the $b^*$ of (1.9), a suboptimal controller would have resulted. In addition, $2a \le 1$ implies $b^* \le 0$, violating the requirement $0 < b < 1$; hence (1.3) cannot be satisfied if $2a \le 1$.

Now consider a similar investigation of a 2nd-order nonlinear time-invariant system. The system equations are

$$\dot{x} = f(x) + g(x)u, \qquad f(x) = [x_1^3\ \ 0]^T,\quad g(x) = [0\ \ 1]^T,\quad u \in \Re \tag{1.10}$$

In this example, we begin by selecting Sontag's Universal Formula (Sontag, 1989) as the control law

$$u = \begin{cases} -\dfrac{\dfrac{\partial V}{\partial x}f + \sqrt{\left(\dfrac{\partial V}{\partial x}f\right)^{2} + \left(\dfrac{\partial V}{\partial x}g\right)^{4}}}{\dfrac{\partial V}{\partial x}g}, & \text{if } \dfrac{\partial V}{\partial x}g \neq 0 \\[2ex] 0, & \text{if } \dfrac{\partial V}{\partial x}g = 0 \end{cases} \tag{1.11}$$

where $V : \Re^2 \to \Re$. Assuming a CLF, V(x), exists, the control law of (1.11) is a smooth, globally asymptotically stabilizing control for (1.10), because it forces the state trajectories of (1.10) to follow the gradient of the CLF. A problem with such a control law is that the $\frac{\partial V}{\partial x}g$ term in the denominator can become small, causing u to become very large. To investigate the role of the CLF, we assume the quadratic function

$$V(x,b,\theta) = x^T \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix} \begin{bmatrix} b & 0 \\ 0 & 1-b \end{bmatrix} \begin{bmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{bmatrix} x, \quad 0 < b < 1,\ 0 \le \theta < \pi/2 \tag{1.12}$$

is a "local CLF" (specifically defined in the next section).
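Formula (1.11) is simple to evaluate numerically for a given quadratic CLF. The sketch below (Python; the parameter values and the rotated-quadratic form $V = x^T R^T D R x$ with $D = \mathrm{diag}(b, 1-b)$ are assumptions of the sketch, mirroring (1.12)) evaluates Sontag's control for system (1.10) at a single state:

```python
import numpy as np

# Sketch of Sontag's universal formula (1.11) for system (1.10):
# x1' = x1^3, x2' = u, with a rotated quadratic CLF V = x^T R^T D R x,
# D = diag(b, 1-b), R a rotation by theta (the form of (1.12)).
def sontag_u(x, b=0.3, theta=0.3):
    c, s = np.cos(theta), np.sin(theta)
    R = np.array([[c, s], [-s, c]])
    P = R.T @ np.diag([b, 1.0 - b]) @ R   # V(x) = x^T P x
    grad = 2.0 * x @ P                    # dV/dx as a row vector
    f = np.array([x[0]**3, 0.0])
    g = np.array([0.0, 1.0])
    LfV, LgV = grad @ f, grad @ g         # Lie derivatives of V along f and g
    if abs(LgV) < 1e-12:                  # the dV/dx * g = 0 branch of (1.11)
        return 0.0
    return float(-(LfV + np.sqrt(LfV**2 + LgV**4)) / LgV)


print(sontag_u(np.array([0.5, -0.5])))
```

By construction the closed loop satisfies $\dot{V} = -\sqrt{(L_f V)^2 + (L_g V)^4} \le 0$ wherever $L_g V \neq 0$, which is exactly the decrease property the formula is designed to deliver.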
Equation (1.12) is a general quadratic function similar to (1.2), except that we introduce a new parameter, θ, which rotates the eigenvectors about the origin. The local CLF gradient is

$$\frac{\partial V}{\partial x} = 2x^T \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix} \begin{bmatrix} b\cos\theta & b\sin\theta \\ -(1-b)\sin\theta & (1-b)\cos\theta \end{bmatrix} \tag{1.13}$$

The specific expression for the control law (1.11) now becomes

$$u = \begin{cases} -\dfrac{L_f V + \sqrt{(L_f V)^{2} + (L_g V)^{4}}}{L_g V}, & \text{if } L_g V \neq 0 \\[1.5ex] 0, & \text{if } L_g V = 0 \end{cases} \tag{1.14}$$

with

$$L_f V = \frac{\partial V}{\partial x} f = 2\big(b\cos^2\theta + (1-b)\sin^2\theta\big)x_1^4 + 2(2b-1)\sin\theta\cos\theta\, x_1^3 x_2$$

$$L_g V = \frac{\partial V}{\partial x} g = 2(2b-1)\sin\theta\cos\theta\, x_1 + 2\big(b\sin^2\theta + (1-b)\cos^2\theta\big)x_2$$

Ensuring that (1.14) does not grow too large might involve minimizing the maximum control effort on some set S by tuning b and θ. Since a closed-form optimal solution probably does not exist, or is at least unknown, u would have to be computed and evaluated at all the points in S for every iteration of the optimization algorithm. Even for such a relatively simple system and control law, we clearly have a multimodal optimization problem which requires many function evaluations. It is much more desirable to find an optimization algorithm that scales nicely with problem complexity.

1.2 Definitions, Problem Statement, and Background Theory

1.2.1 Scope of Systems and CLFs

In the following, we assume differentiable time-invariant nonlinear systems and controls of the form

$$\dot{x} = f(x) + g(x)u(x,K) \tag{1.15}$$

where $x \in \Re^n$ is the n-dimensional state vector, $K \in \Re^p$ is the p-dimensional control gain vector, $u : \Re^{n+p} \to \Re$ is the control input, $f, g : \Re^n \to \Re^n$ are the system dynamics, and $u, f, g \in C^1$, $f(0) = 0$, where $C^1$ is the space of all differentiable functions. Chapter 2 discusses the specific control laws selected for the study. We restrict the analysis to quadratic Lyapunov functions of the form

$$V(x) = x^T P x \tag{1.16}$$

where $P \in \Re^{n\times n}$, $P = P^T > 0$. The optimization parameters shall be represented by the pair (P, K).
Adapted from (Sontag, 1989), a local CLF is defined in this work as a smooth, proper, positive definite function $V : \mathbb{R}^n \to \mathbb{R}$ such that

$$\inf_{u \in \mathbb{R}} \dot{V}(x,u) < 0 \quad \forall x \in X_0 - \{0\}, \qquad (1.17)$$

where $X_0 \subset \mathbb{R}^n$ is the region of interest. The region of interest is defined as a subset of state space, over which the controller is optimized, that contains a neighborhood of the origin and only one equilibrium point. We shall use "local CLF" and "CLF" interchangeably from this point on.

1.2.2 Performance Measures

The performance measures considered herein are: 1) the minimum rate of convergence, $\gamma_{\min}$; 2) the region of attraction, $X$; and 3) the maximum control effort, $u_{\max}$. These performance measures are defined below for all $x$ in a region of state space, $X_0$. The minimum rate of convergence, $\gamma_{\min}$, is defined by the expression (Johansen, 2000a)

$$\|x(t)\| \le c\, e^{-\gamma_{\min} t/2}\, \|x(0)\|, \quad t \ge 0, \; c > 0, \qquad (1.18)$$

where $\|\cdot\|$ denotes the 2-norm, $V(x) \ge c_1 \|x\|^2$, $\dot{V}(x) \le -\gamma_{\min} V(x)$, and $c_2 = \bar{\lambda}(P)$, the magnitude of the largest eigenvalue of $P$. This performance measure, which is based on uniform exponential stability, means that the norm of the states converges no more slowly than an exponential decay with time constant $2/\gamma_{\min}$. The region of attraction, $X$, is defined as (Johansen, 2000a)

$$X = \left\{ x \in X_0 : V(x) \le \inf_{\xi \in \partial X_0} V(\xi) \right\} \qquad (1.19)$$

where inf denotes the infimum and $\partial X_0$ represents the boundary of the set $X_0$. This definition rests on the fact that a trajectory of the system cannot cross a level set $\Omega_\alpha = \{ x \in X_0 : V(x) = \alpha,\ \alpha > 0 \}$ of the Lyapunov function, because the Lyapunov function is a decreasing function of time. The value of $\alpha$ in this case is $\alpha = \inf_{\xi \in \partial X_0} V(\xi)$. The maximum control effort, $u_{\max}$, is defined as

$$u_{\max} = \sup_{x \in X} |u| \qquad (1.20)$$
where $u$ is the output of the controller and sup denotes the supremum. The utility of this performance measure is that, together with the region of attraction measure (1.19), it may be used to design controllers for systems with sensor and actuator saturations by ensuring the saturation levels are avoided.

1.2.3 Problem Statement

The objective of this thesis is to demonstrate the effectiveness of tuning the control law and the local CLF simultaneously to maximize the rate of convergence and minimize the control effort of nonlinear systems. The design intent is to find a controller tuning, specific to the nonlinear system, that is less conservative than a tuning based on robust linear systems theory. Two control laws, defined in Chapter 2, will be tested:

1. an LQR full-state feedback control law, and
2. a Sontag-like nonlinear full-state feedback control law.

It is assumed that a quadratic Lyapunov function is a local CLF for the nonlinear systems studied herein. The proposed solution method does not offer a strict guarantee on performance level. However, it is argued that with enough random performance sampling, the system will achieve the estimated performance level with sufficiently high confidence, making the proposed method a practical tool for real-world controller design.

1.2.4 Literature Review

Convex optimization techniques were used in Johansen (2000a, b) to numerically compute generalized nonquadratic Lyapunov functions for Lipschitz nonlinear systems. That work used Lyapunov functions solely as tools to measure the performance of smooth (locally Lipschitz) nonlinear systems; extensions to controller design were not addressed. Ghaoui & Balakrishnan (1994) proposed the so-called "VK" iteration technique for linear control systems (analogous to the "DK" iteration of mu-synthesis (Zhou, 1997)), in which the control gains $K$ are held fixed and the quadratic Lyapunov matrix $P$ is found using LMI techniques.
This step minimizes the time derivative of the Lyapunov function over $P$; then $P$ is held fixed while $K$ is varied to minimize the time derivative of the Lyapunov function. Although Johansen's work was not used for controller design, an extension is fairly straightforward. While not covered herein, an interesting avenue of future research would be to replace the quadratic Lyapunov functions and linear control laws of Ghaoui & Balakrishnan's VK iteration with the general nonquadratic Lyapunov functions of Johansen's method, and then employ control laws that are linear in the parameters, extending the VK iteration to Lipschitz nonlinear systems that are affine in the control. Johansen's method exploits the structure of the selected Lyapunov function, which is linear in the parameters. If a control-affine system and a linear-in-the-parameters control law were assumed, the controller parameters would also appear linearly in $\dot{V}$, so that a convex optimization method could be used to alternately tune both the CLF and the controller parameters.

1.2.5 Restrictions of Current Methods

Ghaoui & Balakrishnan's method is restricted to quadratic Lyapunov functions and linear systems. The possible extension of Johansen's method outlined above is restricted to control-affine nonlinear systems with linear-in-the-parameters control laws. In addition, these methods require that the CLF and the control law be tuned separately. Fixing one set of parameters while tuning another is a restriction on the optimization method that may hinder the parameter search. In terms of the optimization mechanism known as "hill climbing", holding some parameters constant while varying others is analogous to traveling up the hill in alternating orthogonal directions, whereas tuning all parameters simultaneously is analogous to following the gradient all the way up the hill. Following the gradient can be a more efficient search mode because it is instantaneously the fastest way to gain elevation.
The approach taken herein exploits the interaction between the CLF and the control law as both are varied simultaneously, with the intention of finding better controllers and/or a better optimization method.

1.2.6 Checking Performance Measures for Nonlinear Systems

The performance measures must be met everywhere in $X_0$. Checking every point is impossible, however, because $X_0$, although compact, is an uncountably infinite set of points. We must therefore sample $X_0$ and check the performance measures on a discrete finite subset, $X_d^0$, while guaranteeing that the points between the "checking points" of $X_d^0$ also satisfy the requirements. The following is taken from Theorem 2 of Johansen (2000a) and may be used to guarantee that the Lyapunov conditions are met for all $x \in X_0$, given that they are satisfied for all $x_d \in X_d^0$. We first present some preliminary definitions used in the theorem. Define the checking-set density function $\varepsilon(x)$ by

$$\varepsilon(x) = \inf_{x_d \in X_d^0} \|x - x_d\|, \qquad (1.21)$$

the distance from a point in the region of interest, $x \in X_0$, to its closest neighbor in the checking set, $x_d \in X_d^0$. Define the Lipschitz constant of $f$ as $L_f$. The size parameters are defined as $S = \sup_{x \in X_0,\, k \in \mathcal{K}} \|f(x,k)\|$, where $\mathcal{K}$ is the set of admissible values of the control gains, $\bar{P} = \bar{\lambda}(P)$, and $\bar{X}_0 = \sup_{x,y \in X_0} \|x - y\|$.

Theorem 1.1 (adapted from Theorem 2 of Johansen (2000a)): Suppose $X_0$ is a compact set, $V(x) > 0$ for all $x \in X_0 - \{0\}$, and $f$ is a bounded, locally Lipschitz function. Let $\gamma > 0$ be given and suppose there exists an $\alpha > \gamma > 0$ such that for all $x_d \in X_d^0$

$$\dot{V}(x) \le -\alpha V(x). \qquad (1.22)$$

Assume the checking-set grid density is such that

$$\varepsilon(x) \le \frac{(\alpha - \gamma)\, V(x)}{Q} \qquad (1.23)$$

where

$$Q = 2S\bar{P} + 2\bar{X}_0 \bar{P} L_f + 2\alpha \bar{P} \bar{X}_0. \qquad (1.24)$$

Then for all $x \in X_0$

$$\dot{V}(x) \le -\gamma V(x). \qquad (1.25)$$

Proof: see Johansen (2000a).

Johansen's theorem allows us to ensure stability at the points not sampled.
The theorem states that if the state space is sampled finely enough, then an accurate estimate of $\gamma_{\min}$ may be achieved. The required "closeness" of the sampling points is governed by $Q$, a measure of how fast the system states can change (proportional to the system's Lipschitz constant). By (1.22), the size of $\gamma_{\min}$ in (1.25) is directly related to $\alpha$, $\varepsilon$, $V$, and $Q$. We note, however, that computing $Q$ is very difficult in practice because, for highly nonlinear systems, finding $L_f$ can be very difficult. In addition, both $P$ and $L_f$ are variables in the search for $(P,K)$: values of $P$ and $L_f$ that satisfy (1.23) initially might change enough during the search to invalidate (1.23), forcing a refinement of the checking grid and possibly invalidating the progress made by the search algorithm. We therefore propose a more practical method: fix $\varepsilon(x) = \varepsilon > 0$ and estimate $\gamma_{\min}$ via random sampling of $X_0$ (described in Chapter 2). The estimation of $\gamma_{\min}$, of course, occurs after a $(P,K)$ pair is found that satisfies (1.22). Finding a positive value of $\gamma_{\min}$ does not imply that the performance measures of (1.17)-(1.19) have been met. It does, however, imply the satisfaction of performance measures of the form defined by (1.18)-(1.20), in the following ways: 1) a positive value of $\gamma_{\min}$ is precisely a minimum rate of convergence (1.18); 2) $\gamma_{\min}$ is defined on a region of $X_0$ containing the origin, which must contain a connected level set of the Lyapunov function whose interior's minimum rate of convergence is governed by $\gamma_{\min}$ in (1.18), assuring that $X_0$ contains a minimum region of attraction (1.19); and 3) $u$ has an upper bound on $X$, since $u$ is differentiable and $X$ is bounded. In other words, finding a $(P,K)$ pair that yields $\gamma_{\min} > 0$ means that on $X_0$ a decay rate, a region of attraction, and a maximum control effort all exist, though not necessarily in the desired amounts.
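The fixed-$\varepsilon$ random-sampling idea can be sketched in a few lines. The code below is a minimal illustration, not the thesis implementation: it estimates $\gamma_{\min}$ as the smallest value of $-\dot{V}/V$ over uniform samples of a box standing in for $X_0$; the function name and box-shaped region are assumptions for the example.

```python
import numpy as np

def estimate_gamma_min(P, f, g, u, n_samples=2000, box=1.0, seed=0):
    """Estimate gamma_min = inf over X0 of -Vdot/V for V = x'Px
    by uniform random sampling of the box [-box, box]^n (a stand-in for X0)."""
    rng = np.random.default_rng(seed)
    n = P.shape[0]
    gamma = np.inf
    for _ in range(n_samples):
        x = rng.uniform(-box, box, size=n)
        V = x @ P @ x
        if V < 1e-9:                       # skip points too close to the origin
            continue
        # Vdot = (dV/dx) xdot = 2 x'P (f(x) + g(x) u(x))
        Vdot = 2.0 * (x @ P) @ (f(x) + g(x) * u(x))
        gamma = min(gamma, -Vdot / V)
    return gamma

# Sanity check on a scalar system xdot = -x with V = x^2, where -Vdot/V = 2 exactly.
gamma_hat = estimate_gamma_min(np.array([[1.0]]),
                               f=lambda x: -x,
                               g=lambda x: np.array([0.0]),
                               u=lambda x: 0.0)
```

For points near the origin, the eigenvalue bound of (1.34) below can replace the sampled ratio, which is the motivation for Theorem 1.2.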
A Method for Relaxing the Checking-Grid Density Requirements Around the Origin

Theorem 1.2: Suppose we have a system of the form (1.15) and a CLF of the form (1.16). Let $X_0$ be a compact set containing the origin and, without loss of generality, assume the system is locally stable with $u = 0$. Define $A = \left.\frac{\partial f}{\partial x}\right|_{x=0}$ and

$$\gamma_{\min} = \inf_{x \in X_0 - \{0\}} \left( -\frac{\dot{V}}{V} \right) \qquad (1.26)$$

$$\gamma_{\min}^L = \inf_{x \in X_0 - \{0\}} \left( -\frac{x^T (A^T P + P A)\, x}{x^T P x} \right). \qquad (1.27)$$

Then, as $\bar{X}_0 = \sup_{x,y \in X_0} \|x - y\| \to 0$,

$$\gamma_{\min}^L \to \gamma_{\min}. \qquad (1.28)$$

Proof: The Taylor series expansion of $f$ about the origin is

$$f(x) = Ax + D(x) \qquad (1.29)$$

where $D(x)$ collects the higher-order terms of the expansion. The time derivative of $V$ becomes

$$\dot{V} = \frac{\partial V}{\partial x}\dot{x} = x^T (A^T P + P A)\, x + 2\, x^T P D(x). \qquad (1.30)$$

Decomposing $\gamma_{\min}$ into two parts,

$$\gamma_{\min} = \inf_{x \in X_0 - \{0\}} \left( -\frac{x^T (A^T P + P A)\, x}{x^T P x} - \frac{2\, x^T P D(x)}{x^T P x} \right), \qquad (1.31)$$

and using the facts that $\inf (a + b) \le \inf a + \sup b$ and $\inf (a + b) \ge \inf a + \inf b$, together with the definition of $\gamma_{\min}^L$ in (1.27), one arrives at the inequality

$$\gamma_{\min}^L + \inf_{x \in X_0 - \{0\}} \left( -\frac{2\, x^T P D(x)}{x^T P x} \right) \;\le\; \gamma_{\min} \;\le\; \gamma_{\min}^L + \sup_{x \in X_0 - \{0\}} \left( -\frac{2\, x^T P D(x)}{x^T P x} \right). \qquad (1.32)$$

The inequality (1.32) implies

$$\left| \gamma_{\min} - \gamma_{\min}^L \right| \le \sup_{x \in X_0 - \{0\}} \left| \frac{2\, x^T P D(x)}{x^T P x} \right|. \qquad (1.33)$$

Since the lowest-order terms of the polynomial $D(x)$ are quadratic, the lowest-order terms of $x^T P D(x)$ are cubic, ensuring that as $\bar{X}_0 \to 0$, $\frac{x^T P D(x)}{x^T P x} \to 0$, which implies $\gamma_{\min}^L \to \gamma_{\min}$.
■

Because of the large number of sample points usually required to obtain a reliable estimate of $\gamma_{\min}$, it is more efficient to compute the eigenvalues of $P$ and $A^T P + P A$ and use the bounding relationship

$$-\frac{\bar{\lambda}(A^T P + P A)}{\bar{\lambda}(P)} \le \gamma_{\min}^L. \qquad (1.34)$$

By Theorem 1.2, in a sufficiently small region about the origin the difference between $\gamma_{\min}$ and $\gamma_{\min}^L$ is small, making the left side of (1.34) a good estimate of $\gamma_{\min}$ for points near the origin and dramatically reducing the required size of the checking set $X_d^0$.

1.3 Thesis Outline

Chapter 1 motivates the search for an optimization method for tuning CLFs and their corresponding control laws. Chapter 2 poses the specific optimization problem and describes the genetic algorithm used to solve it. Chapter 3 provides examples of how to use the genetic algorithm of Chapter 2 to tune full-state feedback CLF controllers. Chapter 4 outlines future research exploring the addition of uncertainty and adaptation in the control systems, the use of more general CLFs, and discrete-time control.

2 A Genetic Algorithm for CLF Optimization

2.1 Genetic Algorithm Motivation

Genetic algorithms (GAs) are ideal candidates for a general optimization method for solving the dual tuning problem described in Chapter 1. The reader is referred to Fleming & Purshouse (2002) for an excellent survey of evolutionary algorithms in control engineering. GAs are parallel, global, stochastic search algorithms that do not depend on derivatives to perform the optimization; this is their most attractive feature. The stochastic nature of the algorithm allows it to make random jumps into hard-to-reach regions of the search space that may contain a local extremum; such regions would otherwise be inaccessible using only local gradient information. Metaphorically speaking, GAs are capable of jumping over valleys onto other mountains in search of the highest peak.
The parallel search feature arises because the population of search points is spread out across the parameter search space, allowing the simultaneous exploration of several regions containing local maxima. Non-differentiable fitness functions may be used as the optimization objective because search movements are not based on gradients; they occur either through random jumps called mutations or through selective combinations of two or more highly fit individuals called crossovers. Crossovers effectively serve as interpolations and extrapolations of the existing search points in the GA population. The GA is capable of making fast progress on difficult parameter searches. Unlike gradient-based methods, however, GAs lack a guarantee of steady progress towards a local maximum. A remedy is to decrease the mutation step size: this mimics the small incremental progress of a gradient method, because a small random jump is likely, in many situations, to have a positive projection onto the gradient direction, so that a step in the right direction is made. Gradient methods and other methods that use only local objective-function information are unattractive here for three major reasons: (1) the multimodal nature of the objective function leads to convergence to local extrema; (2) the objective-function evaluation points drift in time (explained below), giving the objective function a time-varying nature that calls for a variable step rate to balance numerical stability against adequate progress, adding an unnecessary degree of difficulty; and (3) a non-differentiable objective function is incompatible with a gradient method, forcing the use of a finite-difference approximation to the gradient, which can be inefficient for a large parameter space (many function evaluations must be made just to move a small amount in parameter space).
A general rule of thumb for ensuring that a GA makes sufficient progress is that the population have sufficient size and time. This typically makes GAs inefficient for searching low-dimensional parameter spaces. Outweighing this drawback, however, is their ability to make fast progress in large parameter search spaces, as well as their ability to optimize both the parameters and the topology of functions, as with genetic programming, a subset of genetic algorithms (Koza et al., 1999).

2.2 Tailoring to the Specific Problem

2.2.1 Optimization Objectives

The objective of the optimization algorithm is to find a $(P,K)$ pair that satisfies the exponential stability conditions

$$V(x) \ge v \|x\|^2, \quad v > 0, \qquad (2.1)$$

and

$$\dot{V}(x) \le -\gamma_{\min} V(x), \quad \gamma_{\min} > 0, \qquad (2.2)$$

for all sampled points $x \in X_d^0$. In addition to (2.1) and (2.2), we also consider the maximum allowable control effort on the sample set, labeled $u_{\max}$:

$$u_{\max} \ge \sup_{x \in X_d^0} |u|. \qquad (2.3)$$

We define an admissible $(P,K)$ pair as a set of parameter values that makes the system (1.15) and Lyapunov function (1.16) satisfy (2.1), (2.2), and (2.3) for some user-specified triple $(v, \gamma_{\min}, u_{\max})$ on the checking set $X_d^0$. Typically, desired values of $(v, \gamma_{\min}, u_{\max})$ are not known in advance, so they are allowed to "float" with progressive improvement during the optimization, and the user decides whether the values obtained at the end of the run are good enough.

2.2.2 Simplifying the Search

We restrict our investigation to quadratic Lyapunov functions. Due to computational restrictions, and to exploit Theorem 1.2 in reducing the required size of $X_d^0$, we also restrict the design optimization to closed convex sets about the origin. To add a degree of local robustness and optimality to the closed-loop nonlinear system, we further assume the CLF is locally $H_2$ inverse optimal (Kokotovic & Arcak, 2001).
That is, the CLF matrix $P$ is the solution to the LQR problem for the linearized system, for some choice of state and control-effort weighting matrices in the objective function. The problem then reduces to using the GA to find $E \in \mathbb{R}^{n \times n}$ such that

$$Q = E^T E + cI, \quad c \ge 0, \qquad (2.4)$$

so that $\underline{\lambda}(Q) \ge c$. We compute a $P = P^T > 0$ that satisfies the Hamilton-Jacobi-Bellman (HJB) equation (known, for linear time-invariant systems, as the Algebraic Riccati Equation) (Levine et al., 1996)

$$A^T P + P A - P B B^T P + Q = 0. \qquad (2.5)$$

The solution of (2.5) minimizes the performance index $J = \int_t^{\infty} \left( x^T Q x + u^2 \right) dt$ when we assign the control

$$u = -B^T P x. \qquad (2.6)$$

We thus use $E$ to generate the CLF $V = x^T P x$. An additional benefit of this approach is the direct computation of a full-state feedback linear control (2.6), if such a control law is desired. One may argue that selecting (2.6) for the control is like using the gradient direction of $V$, normalized to satisfy the HJB equation (Primbs et al., 2000). Therefore, by an argument similar to that of Theorem 1.2, any control law that approaches $u = -B^T P x$ as $x$ approaches the origin will also be a locally optimal controller for the nonlinear system. In addition, because we sample the performance index for the nonlinear system on the set $X_d^0$, the controller shall be optimal on $X_d^0$. In summary, we attempt to maximize the rate of convergence and minimize the control effort on $X_d^0$ using controllers whose linearization is inverse optimal for the linearized system dynamics. A more general version of the control (2.6), based on the proposed control of Primbs et al. (2000), is

$$u = \begin{cases} -\dfrac{\frac{\partial V}{\partial x}f + \sqrt{\left(\frac{\partial V}{\partial x}f\right)^2 + q(x)\left(\frac{\partial V}{\partial x}g\right)^2}}{\frac{\partial V}{\partial x}g} & \text{if } \frac{\partial V}{\partial x}g \ne 0 \\[2mm] 0 & \text{if } \frac{\partial V}{\partial x}g = 0. \end{cases} \qquad (2.7)$$

This control is a modified version of Sontag's Universal Formula (Sontag, 1989) for single-input nonlinear systems that are affine in the control.
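The pipeline of (2.4)-(2.6), from a candidate $E$ to a $(P,K)$ pair, can be sketched as follows. This is an illustrative implementation, assuming SciPy's continuous-time Riccati solver and a hypothetical double-integrator linearization; it is not the thesis's own code.

```python
import numpy as np
from scipy.linalg import solve_continuous_are

def clf_from_E(E, A, B, c=0.0):
    """Build the inverse-optimal CLF matrix P and LQR gain K from a candidate
    matrix E, following (2.4)-(2.6): Q = E'E + cI, solve the Riccati equation
    A'P + PA - PBB'P + Q = 0, and set u = -Kx with K = B'P."""
    n = A.shape[0]
    Q = E.T @ E + c * np.eye(n)
    # SciPy's CARE solver; control weight R = I matches the u'u term in the index
    P = solve_continuous_are(A, B, Q, np.eye(B.shape[1]))
    K = B.T @ P
    return P, Q, K

# Hypothetical double-integrator linearization, for illustration only.
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
P, Q, K = clf_from_E(np.eye(2), A, B, c=0.1)
```

Because $Q = E^T E + cI \succeq cI$ for any $E$, every member of the GA population yields a well-posed LQR problem, which is what makes $E$ a convenient search variable.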
The performance index minimized by (2.7) has the form $J = \int_0^{\infty} \left( q(x) + u^2 \right) dt$, with $q : \mathbb{R}^n \to \mathbb{R}$, $q(x) > 0$ for all $x \ne 0$, and $q(0) = 0$, and it arises from the solution of a more general version of the HJB equation (Primbs et al., 2000)

$$\frac{\partial V}{\partial x}f - \frac{1}{4}\left(\frac{\partial V}{\partial x}g\right)^2 + q(x) = 0. \qquad (2.8)$$

Notice that (2.7) and (2.8) are equivalent to (2.6) and (2.5), respectively, for LTI systems: using the control (2.7) with a quadratic CLF and the linearized dynamics of the nonlinear system reduces to the LQR control (2.6). Combining (2.7) with a quadratic CLF and the nonlinear system dynamics therefore yields globally asymptotically stable dynamics with local optimality. In this work we shall consider (2.6) only for linear controllers and (2.7) only for nonlinear controllers.

2.3 Genetic Algorithm Description

The objective of the genetic algorithm is to maximize the fitness function, which measures how well the controller performs on $X_0$. We choose to maximize the minimum rate of convergence while minimizing the maximum control effort on the discrete checking set $X_d^0$. The fitness function is therefore defined as

$$\mathrm{Fitness}(E_i) = \frac{w_1}{w_1 + w_2}\left( \frac{\phi_{i\gamma} - \underline{\phi}_\gamma}{\bar{\phi}_\gamma - \underline{\phi}_\gamma} \right) + \frac{w_2}{w_1 + w_2}\left( 1 - \frac{\phi_{iu} - \underline{\phi}_u}{\bar{\phi}_u - \underline{\phi}_u} \right), \quad w_1, w_2 > 0, \qquad (2.13)$$

where

$$\phi_{i\gamma} = \min_{x_d \in X_d^0} \left( -\frac{\dot{V}(x_d, E_i)}{V(x_d, E_i)} \right), \quad \phi_{iu} = \max_{x_d \in X_d^0} \left| u(x_d, E_i) \right|,$$

$\underline{\phi}_\gamma = \min_i \phi_{i\gamma}$, $\bar{\phi}_\gamma = \max_i \phi_{i\gamma}$, $\underline{\phi}_u = \min_i \phi_{iu}$, and $\bar{\phi}_u = \max_i \phi_{iu}$. The $E_i$ is left in the function arguments to remind the reader that the $i$th CLF and control depend on the $i$th matrix $E_i$ of the population, as defined by (2.4). The two major components of the fitness function, the normalized rate-of-convergence term and the normalized control-effort term, are considered sub-fitness functions; the normalization eliminates numerical problems caused by the scale mismatch between the two quantities.
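The normalization in (2.13) can be sketched directly. The code below is an illustrative reading of (2.13), operating on precomputed per-member values $\phi_{i\gamma}$ and $\phi_{iu}$; the function name and the zero-span guard are additions for the example.

```python
import numpy as np

def population_fitness(phi_gamma, phi_u, w1=1.0, w2=1.0):
    """Fitness (2.13): normalize each member's convergence rate phi_gamma[i]
    and control effort phi_u[i] over the population, then blend them so that
    a high rate and a low effort both raise fitness."""
    phi_gamma = np.asarray(phi_gamma, dtype=float)
    phi_u = np.asarray(phi_u, dtype=float)

    def norm(phi):
        # population-wide min/max normalization, guarding a degenerate span
        span = phi.max() - phi.min()
        return (phi - phi.min()) / span if span > 0 else np.zeros_like(phi)

    return (w1 * norm(phi_gamma) + w2 * (1.0 - norm(phi_u))) / (w1 + w2)
```

With equal weights, a member with both the best rate and the best effort gets fitness 1, and one with both worst values gets 0, matching the intent of the two sub-fitness terms.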
The selection process tends to favor species good at achieving both a high rate of convergence and a low control effort. The weights $w_1$ and $w_2$ are used to place more selection pressure on a high rate of convergence, $\phi_\gamma$, or on a low control effort, $\phi_u$. They may be left constant or changed dynamically as the algorithm progresses. In fact, pulsing the weights dynamically was found to help control the composition of the GA population, so that members that satisfy one sub-fitness function well are combined with those that satisfy the other well, speeding up the search for species that do both tasks well. Future work could quantify this improvement. To aid in understanding the interdependencies of the variables involved in the fitness function, Figure 2-1 illustrates the computational flow: $E_i \to Q = E_i^T E_i + cI \to P$ from $A^T P + P A - P B B^T P + Q = 0 \to V = x^T P x \to u$ from (2.6) or (2.7) $\to \phi_{i\gamma}, \phi_{iu}$ on $X_d^0 \to$ normalized sub-fitness terms $\to \mathrm{Fitness}(E_i)$.

Figure 2-1: Flow of Fitness Computation

The genetic algorithm used in this work employs the basic evolutionary operators "copy", "crossover", and "mutate", defined as

$$\mathrm{Copy}(E) = E \qquad (2.9)$$

$$\mathrm{Crossover}(E_1, E_2) = \beta E_1 + (1 - \beta) E_2 \qquad (2.10)$$

where $\beta$ is a uniformly distributed random number on the interval $[0, 2]$, and

$$\mathrm{Mutate}(E) = E + \Delta_E \qquad (2.11)$$

where $\Delta_E$ is a uniformly distributed random matrix with dimensions matching those of $E$. The elements of $\Delta_E$ lie in the interval $[-\sigma_E, \sigma_E]$, where $\sigma_E$ is the user-defined mutation step size of $E$.
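The three operators (2.9)-(2.11) translate directly into code. This sketch is illustrative; the shared random generator and function names are choices made for the example.

```python
import numpy as np

rng = np.random.default_rng(1)

def copy_op(E):
    """Copy / reproduction (2.9): pass an individual through unchanged."""
    return E.copy()

def crossover(E1, E2):
    """Crossover (2.10): beta uniform on [0, 2], so the child interpolates
    between the parents (beta < 1) or extrapolates past them (beta > 1)."""
    beta = rng.uniform(0.0, 2.0)
    return beta * E1 + (1.0 - beta) * E2

def mutate(E, sigma_E=0.1):
    """Mutation (2.11): perturb every element uniformly in [-sigma_E, sigma_E]."""
    return E + rng.uniform(-sigma_E, sigma_E, size=E.shape)
```

Note that crossing an individual with itself returns that individual, which is why the Copy operator can be given zero probability in practice, as discussed below.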
The Copy operator (also known as reproduction in the genetic algorithm literature) preserves well-fit individuals in the population so that they may be used in future generations to generate better individuals via the Crossover or Mutation operators. The Crossover operator takes two highly fit individuals and creates a new individual that either interpolates between the two (when $\beta < 1$) or extrapolates along the line connecting them in $E$ search space (when $\beta > 1$). The Mutation operator takes a single highly fit individual and creates a new one by randomly perturbing its elements within the hypercube of half-width $\sigma_E$. To demonstrate the simplicity of imposing constraints on the GA optimization, and to further support the use of a genetic algorithm, we place an additional constraint on the control gains: all elements of the LQR gain vector of (2.6) must be limited in magnitude by some upper bound, $K_{\max}$. This constrains the control effort and focuses the search on areas of parameter space that yield physically realizable control signals. For the nonlinear control law (2.7), this upper bound acts on the gains of the linearized controller. By (2.4) and (2.5), decreasing the norm of $E$ decreases the norm of $K$. We therefore apply the following filter to $E$ to ensure the maximum-gain constraint is satisfied:

$$\mathrm{Gain\_Limit\_Filter}(E) = \begin{cases} \eta \dfrac{K_{\max}}{\max_{i=1,\dots,n} |K_i|}\, E & \text{if } \max_{i=1,\dots,n} |K_i| > K_{\max} \\[2mm] E & \text{otherwise}, \end{cases} \quad 0 < \eta < 1. \qquad (2.12)$$

The genetic algorithm uses this ad-hoc filter to update $E$ so that the controller gain limitations are enforced: if a gain is too large, $E$ is scaled down, indirectly scaling down the entire gain vector. The GA performs a parallel search of the parameter space (Jamshidi et al.): rather than searching one point at a time, as typical optimization methods do, an entire set of points (the population) is considered at once.
Members of the population are randomly selected for evolutionary operation to create the next population (or, viewed differently, to move the population as a whole through the search space). To bias the population motion towards progress, highly fit individuals are more likely to be chosen for evolutionary operation. The members of the population are first ordered by fitness, with the highest-fitness individuals placed at the beginning of the ordered list, denoted $I_{\mathrm{fit}} \subset \{1, 2, \dots, N_{\mathrm{pop}}\}$, where $N_{\mathrm{pop}}$ denotes the size of the population. The probability of selecting the $i$th member of $I_{\mathrm{fit}}$ is

$$P\!\left( I_{\mathrm{fit}}^i \ \mathrm{selected} \right) = \frac{\left( \dfrac{N_{\mathrm{pop}} - i}{N_{\mathrm{pop}}} \right)^{\!s}}{\displaystyle\sum_{j=1}^{N_{\mathrm{pop}}} \left( \dfrac{N_{\mathrm{pop}} - j}{N_{\mathrm{pop}}} \right)^{\!s}}, \quad s \ge 0, \qquad (2.14)$$

where $s$ acts as a fitness bias parameter. This form was selected because, by changing $s$, we may adjust how much of the population is likely to participate in selection: the higher $s$ becomes, the more selection is restricted to highly fit individuals. For example, if $s = 0$ we have a uniform probability of selecting any member of $I_{\mathrm{fit}}$; if $s$ is very large, the only member likely to be selected is the individual with the highest fitness, i.e., the first member of the ordered list. Once a species is selected, a random decision on which operation to perform must be made. The probabilities for Copy, Crossover, and Mutation are labeled $p_r$, $p_c$, and $p_m$, respectively, with the restrictions

$$p_r + p_c + p_m = 1, \quad p_r, p_c, p_m \ge 0. \qquad (2.15)$$

Typically we insert the best-fit individual directly into the next generation and set $p_r = 0$. This is because two copies of the same individual are often selected for the Crossover operation, which simply returns another copy of that individual, implicitly implementing the Copy function. The usual settings for the evolutionary operation probabilities in this work are $p_r = 0$, $p_c = p_m = 0.5$.
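Under the reading of (2.14) given above, the selection weights are easy to tabulate. The sketch below is an illustration of that reading, not the thesis's code; the function name is hypothetical.

```python
import numpy as np

def selection_probabilities(n_pop, s):
    """Selection pdf (2.14) over the fitness-ordered list I_fit (best first):
    p(i) proportional to ((n_pop - i)/n_pop)**s for i = 1..n_pop.
    s = 0 gives a uniform choice; large s concentrates on the fittest."""
    i = np.arange(1, n_pop + 1)
    w = ((n_pop - i) / n_pop) ** s
    return w / w.sum()
```

For example, with $N_{\mathrm{pop}} = 5$, setting $s = 0$ gives every member probability $1/5$, while $s = 2$ gives a monotonically decreasing profile over the ordered list.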
2.3.1 Making $X_d^0$ Dynamic

The required size of the checking set $X_d^0$ grows exponentially with the dimension of the system state; hence the need to assign $X_d^0$ strategically so that a minimal number of points is used. The objective is to maximize the minimum value of $-\dot{V}/V$ on $X_d^0$ (i.e., maximize $\gamma_{\min}$) and to minimize the maximum value of $|u|$ on $X_d^0$ (i.e., minimize $u_{\max}$). For all practical purposes, if we knew where in $X_0$ these max/min and min/max conditions occurred, we could define $X_d^0$ as the set of those points, called critical points, so that the fitness function is evaluated only at the points that matter. The argument is that we need not check every point of $X_0$, just the critical points. Unfortunately, the critical points of each species in the population are unknown; they are functions of $(P,K)$, hence of $E$, and change after every generation. The clusters of $E$'s in any population are perturbations of an average $E$ for that cluster, so collecting the critical points of each $E$ on the set $X_d^0$ creates a perturbation cluster of critical points, labeled $C_d$. The set $C_d$ represents the best available estimate of where the population of controllers (the $E$'s) yields the minimum rate of convergence and the maximum control effort on $X_0$; it therefore makes sense to evaluate the controllers of the next generation at these points. Hence, at the beginning of every generation we redefine the checking set as $X_d^0 = C_d \cup R_d$: the critical points from the last generation ($C_d$) plus a set of randomly selected new points ($R_d$) that act as exploration points for finding better critical points for the current population, "better" meaning yielding a lower fitness value than any point in $C_d$. This method reduces the needed size of $X_d^0$ and increases the speed of the GA.

2.4 Stochastic Estimation of $\gamma_{\min}$

Once a satisfactory $(P,K)$ pair is found, the following stochastic checking method is used.
The sampling method used to estimate the value of $\gamma_{\min}$ in (1.18) is based on the Chebyshev inequality of probability theory (Resnick, 1999; Stark & Woods, 1994). We assume $\gamma$ is a uniformly distributed random variable, $\Gamma$, with mean $\mu_\Gamma$ and variance $\sigma_\Gamma^2$. The probability density function (pdf) of $\Gamma$ is

$$p_\Gamma(z) = \begin{cases} \dfrac{1}{2\sqrt{3}\,\sigma_\Gamma} & \mu_\Gamma - \sqrt{3}\,\sigma_\Gamma \le z \le \mu_\Gamma + \sqrt{3}\,\sigma_\Gamma \\[1mm] 0 & \text{otherwise}. \end{cases} \qquad (2.16)$$

This pdf is chosen because it is reasonable to assume we know only that the distribution of $\Gamma$ is bounded from above and below when we sample it uniformly on $X_0$. Future work could use a better estimate of the distribution of $\Gamma$ based on the actual function $-\dot{V}/V$, since it is a known function of $x$. The Chebyshev inequality may be expressed as

$$P\!\left[ \left| \hat{\mu}_\Gamma - \mu_\Gamma \right| \ge \epsilon \right] \le \frac{\sigma_\Gamma^2}{n \epsilon^2} \qquad (2.17)$$

where $P(A)$ denotes the probability of event $A$ and

$$\hat{\mu}_\Gamma = \frac{1}{n} \sum_{i=1}^{n} \Gamma_i \qquad (2.18)$$

is the maximum-likelihood estimate of $\mu_\Gamma$ (Stark & Woods, 1994), i.e., the sample average of $\Gamma$. If we knew the value of $\sigma_\Gamma$, we could use (2.17) to specify a probable lower bound on $\gamma$, which from the pdf (2.16) would be

$$\gamma_{\min} = \inf_{x \in X_0} \gamma = \hat{\mu}_\Gamma - \epsilon - \sqrt{3}\,\sigma_\Gamma. \qquad (2.19)$$

The bound-failure probability in (2.17), denoted $P_{\gamma_{\min}}$, is a free parameter that may be met by choosing $n$ large enough, i.e.,

$$n \ge \frac{\sigma_\Gamma^2}{P_{\gamma_{\min}} \epsilon^2}. \qquad (2.20)$$

We do not have the luxury of knowing $\sigma_\Gamma$, however, so an upper bound on $\sigma_\Gamma$ must be estimated. The maximum-likelihood estimate of $\sigma_\Gamma$ (Stark & Woods, 1994), denoted $\hat{\sigma}_\Gamma$, is given by

$$\hat{\sigma}_\Gamma = \sqrt{ \frac{1}{n} \sum_{i=1}^{n} \left( \Gamma_i - \hat{\mu}_\Gamma \right)^2 } \qquad (2.21)$$

and, along with its uncertainty $u_{\hat{\sigma}}$, may be used to express a total value for $\sigma_\Gamma$:

$$\sigma_\Gamma = \hat{\sigma}_\Gamma + u_{\hat{\sigma}}. \qquad (2.22)$$

By Taylor series expansion of continuous functions, an approximate bound on $u_{\hat{\sigma}}$ (for sufficiently small $\epsilon$) is

$$u_{\hat{\sigma}} \le \left| \frac{\partial \hat{\sigma}_\Gamma}{\partial \hat{\mu}_\Gamma} \right| \epsilon, \qquad (2.23)$$

where $\partial \hat{\sigma}_\Gamma / \partial \hat{\mu}_\Gamma$ may be thought of as the sensitivity of $\hat{\sigma}_\Gamma$ to the estimate $\hat{\mu}_\Gamma$, and $\epsilon$ as the uncertainty of $\hat{\mu}_\Gamma$.
The absolute value of $\partial \hat{\sigma}_\Gamma / \partial \hat{\mu}_\Gamma$ is used to make a conservative estimate of $\sigma_\Gamma$. Substituting the right-hand sides of (2.21)-(2.23) into (2.20) yields

$$n \ge \frac{\left( \hat{\sigma}_\Gamma + \left| \dfrac{\partial \hat{\sigma}_\Gamma}{\partial \hat{\mu}_\Gamma} \right| \epsilon \right)^2}{P_{\gamma_{\min}} \epsilon^2}. \qquad (2.24)$$

The result is a transcendental inequality (both sides depend on $n$ through the sample statistics) which, when satisfied, guarantees that for small enough $\epsilon$,

$$\gamma_{\min} = \hat{\mu}_\Gamma - \epsilon - \sqrt{3} \left( \hat{\sigma}_\Gamma + \left| \frac{\partial \hat{\sigma}_\Gamma}{\partial \hat{\mu}_\Gamma} \right| \epsilon \right) \qquad (2.25)$$

with probability $1 - P_{\gamma_{\min}}$. To solve (2.24) we specify the desired values of $\epsilon$ and $P_{\gamma_{\min}}$ and guess a value for $n$. The right side of (2.24) is then computed and the inequality is checked; if it fails, a larger value of $n$ is chosen, and the process repeats until $n$ is large enough. After (2.24) is satisfied, (2.25) is computed. If $\gamma_{\min}$ is positive, then we have, with probability $1 - P_{\gamma_{\min}}$, an exponentially stable control law on the domain $X \subset X_0$, with a Lyapunov function to prove it. In practice, however, (2.25) turns out to be an overly conservative estimate of $\gamma_{\min}$, because $\Gamma$ is not truly a uniformly distributed random variable: it is a nonlinear function of a uniformly distributed random variable, making $\Gamma$ non-uniformly distributed (Stark & Woods, 1994). We can roughly estimate $\gamma_{\min}$ directly by taking the lowest value of $\Gamma$ observed while sampling $X_0$; a better estimate of $\gamma_{\min}$ can then be made by computing the uncertainty of this estimate and using it to obtain a worst-case $\gamma_{\min}$. Hence we define the sample minimum of $\Gamma$,

$$\Gamma_{\min} = \min_{x \in X_d^0} \Gamma, \qquad (2.26)$$

and the estimate of $\gamma_{\min}$ as

$$\gamma_{\min} = \Gamma_{\min}. \qquad (2.27)$$

We have used the stochastic arguments above to derive a reasonable estimate of the size of the checking set for post-GA numerical estimation of the minimum rate of convergence. We shall assume that this checking set is also sufficient for checking the maximum magnitude of the control signal.
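The post-GA check can be sketched numerically. The code below is a simplified illustration of (2.18), (2.20)-(2.21), and (2.25)-(2.27): it drops the sensitivity term $|\partial\hat{\sigma}_\Gamma/\partial\hat{\mu}_\Gamma|\,\epsilon$ of (2.23) (so it slightly understates the required $n$), and the function name and return convention are assumptions for the example.

```python
import numpy as np

def gamma_min_estimates(gamma_samples, eps=0.01, p_fail=0.05):
    """Post-GA stochastic check from samples of Gamma = -Vdot/V on X0.
    Returns (n_required, conservative, direct), where
      n_required   -- Chebyshev sample size (2.20) with sigma replaced by its
                      ML estimate (a simplified fixed-point form of (2.24)),
      conservative -- the uniform-pdf lower bound in the style of (2.25),
      direct       -- the sample minimum of (2.26)-(2.27)."""
    g = np.asarray(gamma_samples, dtype=float)
    mu_hat = g.mean()                                  # (2.18)
    sigma_hat = g.std()                                # (2.21), ML estimate
    n_required = int(np.ceil(sigma_hat**2 / (p_fail * eps**2)))   # (2.20)
    conservative = mu_hat - eps - np.sqrt(3.0) * sigma_hat        # cf. (2.25)
    direct = g.min()                                   # (2.26)-(2.27)
    return n_required, conservative, direct
```

As the text notes, the conservative bound typically sits well below the direct sample minimum, which is why the latter is used in practice.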
Note that for pdf's with substantially higher-order moments (moments beyond the mean and variance), we expect this estimate to be invalid. It is, however, a good starting point for the size of the checking set, which can be increased until the estimates of γ_min and u_max no longer change substantially. Randomly sampling γ_min and u_max as discussed above, along with a few numerical simulations, usually suffices to convince the control engineer that the controller is indeed stable and operates within the allowable maximum control effort.

2.4.1 GA Flowchart

The flow chart of the algorithm is given below in Figure 2-2. At the beginning of the algorithm, a randomly generated set of E's is inserted into the first population (i.e., generation 0). The set of E's is used to compute a set of (P, K) pairs, which in our case means solving the HJB equation (2.5) for P and using it to solve for K, depending on the selection of the control law (2.6) or (2.7). The fitness computation of Figure 2-1 follows, and the fitness of the population is used to select the best individuals for performing the evolutionary operations defined in (2.9) through (2.11). The E with the highest fitness, denoted the "best E," is returned along with the corresponding (P, K) pair. The controller performance, i.e., the minimum rate of convergence and the maximum control effort, is numerically estimated by randomly sampling X_0 using the sample size defined by (2.24). If the performance is good enough, the CLF and controller are accepted as a solution to the optimization problem, or possibly used in some other performance simulation. If the performance is unacceptable, the GA parameters are tuned or the controller performance requirements are relaxed, and the process is repeated. In the next chapter we use the ideas outlined above to tune the linear and nonlinear controllers of (2.6) and (2.7), respectively, for a set of nonlinear systems that are relatively difficult to control.
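The loop of Figure 2-2 can be sketched in a few lines. This is a hedged illustration only: the fitness, selection, and mutation operators below are generic placeholders standing in for the thesis's operators (2.9)–(2.13), and the "E" individuals are reduced to scalars; in the actual algorithm each E determines a (P, K) pair through a Riccati/HJB solve.

```python
import numpy as np

rng = np.random.default_rng(1)

def fitness(E, checking_set):
    # Placeholder fitness: worst-case value of a surrogate "rate" over X0^d.
    # (The thesis's fitness (2.13) weighs gamma_min against u_max instead.)
    return min(1.0 - (x - E) ** 2 for x in checking_set)

def run_ga(pop_size=20, generations=30, checking_set=(0.4, 0.5, 0.6)):
    pop = list(rng.uniform(-5.0, 5.0, size=pop_size))   # generation 0: random E's
    for _ in range(generations):
        ranked = sorted(pop, key=lambda E: fitness(E, checking_set), reverse=True)
        parents = ranked[: pop_size // 2]               # select the best individuals
        children = [p + rng.normal(0.0, 0.5) for p in parents]  # random mutations
        pop = parents + children                        # next generation (elitist)
    return max(pop, key=lambda E: fitness(E, checking_set))    # "best E"

best_E = run_ga()
```

Because parents survive into the next generation, the best fitness never decreases, mirroring the "best E" feedback path of the flowchart.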
Figure 2-2: GA Optimization Algorithm. (Flowchart: generate a population of random E's, or of random mutations of the best E after the first pass; compute (P, K) from each E using the linearized dynamics A and B; evaluate fitnesses against the checking set X_0^d and the nonlinear dynamics f, g; apply selection, the evolutionary operations, and the gain limits until the maximum generation count is reached; return the best E; numerically check the controller performance; if the performance is not good enough, adjust the GA parameters and repeat.)

3 Full State-Feedback CLF Control

3.1 Selection of Benchmark Examples

The purpose of this chapter is to demonstrate how to use the ideas of Chapter 2 to tune full-state-feedback CLF controllers. The reader is reminded that the discussion is limited to the two controller types of (2.6) and (2.7), relisted below:

Linear (2.6):
$$ u = -B^T P x \tag{3.1} $$

Nonlinear (2.7):
$$ u = \begin{cases} -\dfrac{x^T P f + \sqrt{\left(x^T P f\right)^2 + \left(x^T Q x\right)\left(x^T P g\right)^2}}{x^T P g}, & x^T P g \neq 0 \\[2mm] 0, & x^T P g = 0 \end{cases} \tag{3.2} $$

where P is the solution to the algebraic Riccati equation

$$ A^T P + P A - P B B^T P + Q = 0 $$

using the linearized dynamics (A, B) of the nonlinear control-affine system

$$ \dot{x} = f(x) + g(x)u, \quad x \in \mathbb{R}^n,\ u \in \mathbb{R},\ f, g : \mathbb{R}^n \to \mathbb{R}^n, $$

and the selected Q matrix defined by the quadratic integral performance index $\int_0^\infty \left(x^T Q x + u^2\right)\,dt$.

The following example systems are taken from Ngamsom (2001). The theory developed by Ngamsom is based partially on maximizing the region of attraction (region of stability) of nonlinear systems using linear control laws. The drawback of the method proposed by Ngamsom is that the controller is synthesized using an uncertain linearized model of a nonlinear system. The theory is therefore based on linear control of nonlinear systems; that is, only local information about the system dynamics and worst-case assumptions on the effects of the nonlinearities are used. The design intent here is to find a controller tuning specific to the nonlinear system that is less conservative than a tuning based on robust linear-systems theory.
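The construction of P, the linear law (3.1), and a Sontag-type law in the spirit of (3.2) can be sketched as follows, assuming R = 1 as in the performance index above. The Riccati solve uses the stable invariant subspace of the Hamiltonian matrix (a standard technique, not necessarily the thesis's implementation), the numerical (A, B) pair is the linearization of the first example in this chapter, and Q is an illustrative choice, not a tuned one.

```python
import numpy as np

def solve_care(A, B, Q):
    """P solving A'P + PA - P B B' P + Q = 0 (R = 1), via the stable
    invariant subspace of the Hamiltonian matrix."""
    n = A.shape[0]
    H = np.block([[A, -B @ B.T], [-Q, -A.T]])
    w, V = np.linalg.eig(H)
    stable = V[:, w.real < 0]                 # n eigenvectors with Re(lambda) < 0
    X1, X2 = stable[:n, :], stable[n:, :]
    return np.real(X2 @ np.linalg.inv(X1))    # P = X2 X1^{-1}

def u_linear(x, P, B):
    return -(B.T @ P @ x).item()              # (3.1): u = -B'Px

def u_sontag(x, P, Q, f, g):
    # Sontag-type law in the spirit of (3.2); scaling may differ from the thesis.
    a = (x @ (P @ f(x))).item()               # x'P f(x)
    b = (x @ (P @ g(x))).item()               # x'P g(x)
    if b == 0.0:
        return 0.0
    return -(a + np.sqrt(a * a + (x @ (Q @ x)).item() * b * b)) / b

A = np.array([[0.0, 1.0], [10.0, 10.0]])      # linearized pair of the example
B = np.array([[0.0], [2.5]])
Q = np.eye(2)                                 # illustrative Q, not a tuned one
P = solve_care(A, B, Q)
u0 = u_linear(np.array([1.0, 0.0]), P, B)     # example evaluation of (3.1)
```

In the GA setting, each candidate E determines a Q, this solve produces the corresponding (P, K) pair, and the two control laws above are then evaluated on the checking set.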
We use the genetic algorithm described in Chapter 2 to tune linear controllers (3.1) and compare them with those of Ngamsom. For nonlinear control, we use the augmented Sontag's universal control law (3.2) from Primbs et al. (2000) because of its inherent global stability and local inverse optimality. We show how the genetic algorithm's tuning of the CLF and of the quadratic performance index q(x) = x^T Q x results in less control effort than the CLF produced by linear-dynamics-based methods. The main reason for selecting this nonlinear control law is that it converges to the linear control law in a sufficiently small neighborhood of the origin. This allows the P matrix of Ngamsom's linear control method to be used directly in the nonlinear control law, providing an additional mode of comparison between Ngamsom's controller and the genetic-algorithm-tuned CLF controller. Ngamsom's linear control method happens to be a very effective design procedure for a robust linear controller, and it is considered a good benchmark for the methods proposed in this thesis.

3.1.1 Comparing GAC to LARC

To facilitate comparison between Ngamsom's work and ours, we shall refer to the controller and CLF resulting from the method proposed herein as the "Genetic Algorithm Controller" (GAC). The GAC implementing the control law (3.1) shall be referred to as the linear GAC (LGAC), and the GAC implementing the control law (3.2) shall be referred to as the nonlinear GAC (NGAC). The linear controller that maximizes the region of stability is called the Lyapunov Attractive Region Controller (LARC), after Ngamsom. Similarly, the LARC implementing the control law (3.1) shall be referred to as the linear LARC (LLARC), and the LARC implementing the control law (3.2) shall be referred to as the nonlinear LARC (NLARC). Ngamsom's design method is a systematic method for designing linear controllers for a class of nonlinear systems.
The method uses the linearized dynamics of the nonlinear system along with assumptions on how the nonlinearities, treated as uncertainties, affect the linearized system. We observe that the size of the region of stability is a side effect of the LARC: the region of stability is not explicitly selected but is an inherent result of the combined nonlinear system and linear controller. The proposed GAC method offers the ability to explicitly select the region of stability, in addition to direct tuning of both the minimum rate of convergence and the maximum control effort. This freedom comes with a price, however. For all of the examples examined in this study, the region of stability for the GAC was smaller overall than the region of stability for the LARC. On the other hand, the GAC allows the flexibility to select where in state space to attempt to achieve stability, as well as the ability to vary the rate of convergence while varying the control effort in this region.

3.1.2 Displaying Results

We display the performance of our example controllers in the region defined by X_0 in a set of tables. The tables list the estimated values of the minimum rate of convergence γ_min, the mean rate of convergence μ_γ, the standard deviation of the rate of convergence σ_γ, the maximum control effort magnitude u_max, the mean control effort μ_u, and the standard deviation of the control effort σ_u. These quantities are estimated by uniform random sampling in X_0, with equation (2.24) used to determine a sufficient sample size. For the second-order system of the first example, we include 3D plots over X_0 of the level sets of the Lyapunov function, the value of γ, and the locations of the estimated critical points, defined as the location estimates where γ_min and u_max occur for every controller in the GA population. The 3D plots aid in visualizing the effect the CLF shape has on the rate of convergence and on the locations of the critical points.
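The sampled statistics in the tables can be produced as sketched below. The rate definition γ(x) = −V̇(x)/V(x) is an assumption on our part (the thesis defines its rate measure in Chapter 2), and the closed loop used here is a toy system, not one of the benchmark examples.

```python
import numpy as np

rng = np.random.default_rng(2)

def sample_performance(n, ranges, gamma_fn, u_fn):
    """min/mean/std of the rate and of |u| over uniform samples of X0."""
    lo = np.array([r[0] for r in ranges])
    hi = np.array([r[1] for r in ranges])
    X = rng.uniform(lo, hi, size=(n, len(ranges)))      # uniform samples of X0
    g = np.array([gamma_fn(x) for x in X])
    u = np.array([abs(u_fn(x)) for x in X])
    return {"gamma_min": g.min(), "gamma_mean": g.mean(), "gamma_std": g.std(),
            "u_max": u.max(), "u_mean": u.mean(), "u_std": u.std()}

# Toy closed loop xdot = -x with V = x'Px, so Vdot = -2 x'Px and gamma = 2.
P = np.array([[2.0, 0.0], [0.0, 1.0]])
gamma = lambda x: -(-2.0 * (x @ P @ x)) / (x @ P @ x)   # -Vdot/V (assumed form)
u = lambda x: -x.sum()                                   # placeholder control law
stats = sample_performance(1000, [(-25.0, 25.0), (-100.0, 100.0)], gamma, u)
```

For the toy loop the rate is exactly 2 everywhere, so the min, mean, and standard deviation collapse; on the benchmark systems these statistics spread out, which is what the tables report.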
We leave the rate-of-convergence plot blank where γ is negative or very close to the origin, where it is numerically ill-conditioned. The blank spots where the rate of convergence is negative depict "holes" where the system becomes unstable with respect to the specific CLF.

3.2 Example 1: Artificial System

The equations of motion for the artificial second-order system considered by Ngamsom (2001) are:

$$ \dot{x} = \begin{bmatrix} x_2 \\ 10\sin(x_1) + 10x_2 + 5x_2\sin(x_1 x_2) - x_1^2 x_2 \end{bmatrix} + \begin{bmatrix} 0 \\ 1.5 + \cos(x_1) \end{bmatrix} u, \quad x = \begin{bmatrix} x_1 & x_2 \end{bmatrix}^T \in \mathbb{R}^2,\ u \in \mathbb{R} \tag{3.3} $$

The linearized dynamics about the origin are:

$$ \dot{x} = \begin{bmatrix} 0 & 1 \\ 10 & 10 \end{bmatrix} x + \begin{bmatrix} 0 \\ 2.5 \end{bmatrix} u \tag{3.4} $$

Table 3-1 displays the results of the LARC method: the Lyapunov function matrix P, the linear control gains K, and the matrix Q of the quadratic performance index. The structure of Q in Ngamsom's work is typically fixed as a diagonal matrix, and the norm of Q is tuned via an optimization method. One advantage of the proposed GAC method lies in its ability to find a more general Q (with nonzero off-diagonal elements) to optimize the performance index.

P = [4160.8  125.6; 125.6  54.4],  K = [157.49  68.244],  Q = [16000  0; 0  16000]
Table 3-1: LLARC Parameters for System (3.3) (Ngamsom, 2001)

Table 3-2 displays the randomly sampled performance of the linear LARC of Table 3-1 on the artificial system (3.3). The range of X_0 is selected to be x_1 ∈ [−25, 25] and x_2 ∈ [−100, 100]. Note the negative value of γ_min, implying instability.

X_0 range: x_1 ∈ [−25, 25], x_2 ∈ [−100, 100]
γ_min = −76.935, μ_γ = 47.769, σ_γ = 80.6, u_max = 10709, μ_u = 3805.6, σ_u = 2515.5
Table 3-2: Performance of LLARC on System (3.3)

The randomly sampled performance of the nonlinear LARC on the artificial system (3.3) is displayed in Table 3-3. The minimum rate of convergence is positive by design, and the control effort is very high.
X_0 range: x_1 ∈ [−25, 25], x_2 ∈ [−100, 100]
γ_min = 1.0294, μ_γ = 135.75, σ_γ = 128.340, u_max = 1.4878×10⁷, μ_u = 17928, σ_u = 1.6251×10⁵
Table 3-3: Performance of NLARC on System (3.3)

Table 3-4 displays the randomly sampled performance of the LLARC on the linearized artificial system dynamics. This table is used to quantify the local performance of both controllers about the origin. It allows comparison between the linear and nonlinear behavior of the system and helps quantify the effects the nonlinearities have on the system performance.

X_0 range: x_1 ∈ [−25, 25], x_2 ∈ [−100, 100]
γ_min = 3.7641, μ_γ = 91.364, σ_γ = 93.507, u_max = 10709, μ_u = 3796.5, σ_u = 2518.9
Table 3-4: Performance of LLARC on Linearized System (3.4)

Consider the level sets of the CLF using P from Table 3-1 in Figure 3-1. The shape and orientation of the elliptical "rings" determine the path of the state trajectory under the CLF control law. A true CLF control law forces the state trajectory to cross into successively smaller rings. The trajectory in Figure 3-1 actually increases the CLF value initially, yet eventually converges to the origin. This behavior explains the negative value of γ_min in Table 3-2. The CLF is therefore only a local CLF and not a CLF on X_0. The system is unstable with respect to this CLF, but may be locally stable for some other Lyapunov function; perhaps one whose level sets are turned slightly more counterclockwise, so that the state trajectory does not cross into a higher level set.

Figure 3-1: Level Sets of V for LLARC on System (3.3)

The rate of convergence of the CLF from Figure 3-1 is displayed in Figure 3-2. The points in X_0 where γ < 0, or points close to the origin that are numerically ill-conditioned due to division by a very small number, are left blank. The points where γ < 0 are points where the trajectory crosses into larger level sets and can exit X_0, possibly never to return.
We see that in the region where the trajectory crosses into successively higher level sets, the rate of convergence is negative, as expected.

Figure 3-2: "Gamma" (γ) for LLARC on System (3.3)

Figure 3-3 is an estimated region of attraction for the controller reported in Ngamsom (2001), obtained using Monte Carlo simulation. The shaded blocks represent regions that yield a stable trajectory when used as the region of the initial state; all other blocks represent regions that yield an unstable trajectory when used as the region of the initial state. A Lyapunov function with level sets that line up with the edge of stability in Figure 3-3 (where white and black boxes are in contact) would be a better selection to prove the true region of stability of the artificial system (3.3) via Lyapunov stability theory. The values of γ here are negative, supporting the results of the Monte Carlo simulation used to generate Figure 3-3. The objective of the GA optimization is to make all values of γ positive and large while minimizing the amount of control effort it takes to do so.

Figure 3-3: Stability Table from Ngamsom (2001)

Figure 3-4 and Figure 3-5 display the states and control signal of the linear LARC on the artificial system (3.3) for the initial condition (10, 20). The states converge shortly after 0.2 s, while u_max reaches approximately 2700.

Figure 3-4: States of System (3.3) Using LLARC

Figure 3-5: Control Signal of LLARC on System (3.3)

Figure 3-6 displays the level sets of the CLF for the nonlinear LARC and the state trajectory of the artificial system (3.3) for the initial condition (10, 20). The state trajectory is forced to cross successively lower levels of the CLF and exhibits very sudden, discrete "switch-like" direction changes. Such behavior is due to V̇ satisfying the strict equality

$$ \dot{V} = -2\sqrt{\left(x^T P f\right)^2 + \left(x^T Q x\right)\left(x^T P g\right)^2} $$

Viewing Figure 3-7, we see that the rate of convergence is positive everywhere by design of the control law.
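A stability table like Figure 3-3 can be generated by Monte Carlo simulation: tile X_0 with blocks, integrate one trajectory per block under the closed loop, and mark each block according to whether the trajectory converges. The sketch below uses forward Euler and a toy closed loop with a bounded region of attraction, not system (3.3).

```python
import numpy as np

def simulate(x0, step, dt=0.01, T=10.0):
    """Forward-Euler integration; True if the state ends near the origin."""
    x = np.array(x0, dtype=float)
    for _ in range(int(T / dt)):
        x = x + dt * step(x)
        if not np.all(np.isfinite(x)) or np.linalg.norm(x) > 1e6:
            return False                      # clearly diverged
    return np.linalg.norm(x) < 1e-2

def stability_table(step, x1_range, x2_range, blocks=10):
    x1s = np.linspace(*x1_range, blocks)
    x2s = np.linspace(*x2_range, blocks)
    return np.array([[simulate((a, b), step) for b in x2s] for a in x1s])

# Toy closed loop: contracting only inside ||x|| < 3, so the region of
# attraction is bounded and the table shows a stable core with unstable edges.
step = lambda x: -x if np.linalg.norm(x) < 3.0 else x
table = stability_table(step, (-5.0, 5.0), (-5.0, 5.0))
```

The boolean table plays the role of the shaded/unshaded blocks of Figure 3-3; the boundary between True and False cells traces the estimated edge of stability.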
The peaks in the γ plot correspond to the fastest rates of change, and they are evident in the state transition plot of Figure 3-8. The control law forces γ to be strictly positive, so that the part of the trajectory in Figure 3-1 that crosses into higher levels of the CLF under the linear LARC is now steered inward toward successively lower levels of the CLF. Figure 3-8 and Figure 3-9 display the states and control signal for the (10, 20) initial-condition trajectory. The states do not converge much faster in the nonlinear LARC case than in the linear LARC case; however, the maximum magnitude of x_2 is smaller for the nonlinear LARC because the trajectory is forced to stay inside the level set corresponding to the initial condition. The control effort is, of course, much larger for the nonlinear LARC. Figure 3-9 shows the cost of the restored stability obtained by using the nonlinear controller instead of the linear controller: excessively large control effort. The need to reorient the CLF level sets to help decrease the large control effort is made evident in this example.

Figure 3-6: Level Sets of V for NLARC on System (3.3)

Figure 3-7: "Gamma" (γ) for NLARC on System (3.3)

Figure 3-8: States of System (3.3) Using NLARC

Figure 3-9: Control Signal of NLARC on System (3.3)

The LARC method does not directly use information about the rate of convergence, nor about the control effort, of the nonlinear control system. In fact, P is not explicitly treated as the matrix of the quadratic CLF, V = x^T P x, but it may be viewed as such. The following question is the primary motivation for this work.

Question 3.1: Can we achieve better control performance than that of Ngamsom (2001) for the nonlinear system, in terms of minimum rate of convergence and maximum control effort in X_0, by using the CLF V = x^T P x and directly checking these performance measures at all points of X_0^d to maximize the performance index defined by the fitness function (2.13)?
We now attempt to answer Question 3.1 by applying the proposed GAC method to the example systems.

3.2.1 GAC Case #1: Linear GAC

The genetic algorithm parameters for Case #1 and the resulting GAC performance are given in Table 3-5. This case implements linear control. The GA parameters are described in Chapter 2 and in the appendix, where they are used in the Matlab code that implements the GA optimization procedure.

System: Artificial (3.3); Controller type: linear; Generations: 50; Population size: 50; Checking points: 100; Critical points: 33; Δ_E: 100; K_max: 1000; c: 0; X_0 range: x_1 ∈ [−25, 25], x_2 ∈ [−100, 100]
Table 3-5: GA Parameters to Optimize LGAC for System (3.3)

The results of the GA optimization of the linear GAC for the artificial system (3.3) are listed in Table 3-6. The W vector contains the weighting factors used in the fitness function (2.13) to vary the weight between rate of convergence and control effort. The results in Table 3-6 include the three weight settings W = [1 0], W = [1 1], and W = [0 1]. They represent three cases in which the GA selects population members with, respectively, a large γ_min while ignoring u_max; a large γ_min and a small u_max; and a small u_max while ignoring γ_min. The reader is reminded that, ultimately, the GA tunes Q to yield a favorable pair (P, K). Comparing the GAC to the LARC therefore begins with comparing their respective Q matrices. As can be seen, the Q's of the linear GAC are very different from the Q's of the linear LARC. An obvious difference is the nonzero off-diagonal terms in the Q's resulting from the GA optimization.
Case 1, W = [1 0]: P = [2768.8  347.3; 347.3  71.8], K = [868.23  179.38], Q = [6.9845×10⁵  1.1141×10⁵; 1.1141×10⁵  20450]
Case 2, W = [1 1]: P = [2993.0  352.2; 352.2  62.1], K = [880.42  155.16], Q = [715290  102800; 102800  16910]
Case 3, W = [0 1]: P = [3.9147  3.5413; 3.5413  3.5308], K = [8.8532  8.8269], Q = [0.0854  0.0452; 0.0452  0.0267]
Table 3-6: Optimized Parameters of LGAC for System (3.3)

Table 3-7 lists the randomly sampled performance of the GAC for the three weight cases. As one might expect, γ_min decreases and u_max increases as the emphasis shifts from minimum rate of convergence to maximum control effort. Unfortunately, this is not always the case, as we shall see in the later examples. The first two cases in Table 3-7 yielded a much better performance than the LARC on X_0 for the minimum rate of convergence, but not for maximum control effort. Only the third case yielded a better value of maximum control effort; however, the third case yielded an unstable system and is therefore not a fair comparison to the LARC control effort. Except for the last case, the controller gains are much larger for the GAC than for the LARC. The design objective of simultaneously achieving a better rate of convergence and a better maximum control effort than the LARC is not met. Though not explored here, it is posited that a weight case somewhere between cases 2 and 3 might result in a controller that achieves the design objective, or at least comes closer to it.

Case 1, W = [1 0]: γ_min = 34.564, μ_γ = 301.74, σ_γ = 208.37, u_max = 39514, μ_u = 13381, σ_u = 9303.7
Case 2, W = [1 1]: γ_min = 29.737, μ_γ = 269.47, σ_γ = 174.48, u_max = 37392, μ_u = 12829, σ_u = 8786.4
Case 3, W = [0 1]: γ_min = −319.66, μ_γ = 6.3853, σ_γ = 76.592, u_max = 1099.2, μ_u = 449.21, σ_u = 270.73
Table 3-7: Performance of LGAC on System (3.3)

Table 3-8 lists the randomly sampled performance of the GA controller on the linearized system (3.4). All minimum rates of convergence are positive, as expected for the linear system, because the linear GAC is a linear quadratic regulator.
From Table 3-8 we observe that the local rate of convergence is faster, indicating that the system nonlinearities work against the controller's effort to stabilize the system.

W = [1 0]: γ_min = 76.5938, μ_γ = 522.89, σ_γ = 230.55, u_max = 3933.4, μ_u = 13250, σ_u = 9287.8
W = [1 1]: γ_min = 92.0664, μ_γ = 470.25, σ_γ = 174.51, u_max = 3730.0, μ_u = 12781, σ_u = 8801.2
W = [0 1]: γ_min = 0.05950, μ_γ = 20.733, σ_γ = 3.6937, u_max = 1096.7, μ_u = 449.46, σ_u = 270.17
Table 3-8: Performance of LGAC on Linearized System (3.4)

Figure 3-10, Figure 3-11, and Figure 3-12 contain plots of the level sets of the CLF with the state trajectory for initial condition (10, 20), the rate of convergence γ, and the locations of the estimated critical points for all controllers in the GA population. We see that γ > 0 on all of X_0; that is, there are no "holes" in the γ plot, unlike that of the LARC. Many critical points exist in the top-left side of Figure 3-12, where the value of γ is low (see Figure 3-11) for many members of the GA population. Had we not used the critical-point estimation and instead selected a fixed or random grid as the checking set X_0^d, many of these points would have been missed, and much of the processing time would have been spent checking irrelevant points with a high γ value or a low u_max value. The number of critical points was set to 33. The placement of these points by the critical-point search algorithm (Chapter 2, "Making X_0^d Dynamic") is displayed in Figure 3-12. The total number of points is 100, three times the number of critical points. The critical-point searching algorithm clustered the majority of the critical points in the top-left corner. It is reasonable to assume the spacing of these points is close to the spacing required to satisfy Theorem 1.1. If this is the case, then satisfying the spacing requirements of Theorem 1.1 would require many more points than the 100 search points used in X_0^d, and the search procedure would be much slower.
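A dynamic checking set in the spirit of "Making X_0^d Dynamic" can be sketched as follows. This is our own simplified heuristic, not the thesis's: the current worst-γ points are retained as critical points from iteration to iteration while the remaining search points are redrawn at random, so the set concentrates where γ is low.

```python
import numpy as np

rng = np.random.default_rng(3)

def refresh_checking_set(points, gamma_fn, n_critical, lo, hi):
    """Keep the current worst-gamma points, redraw the rest uniformly."""
    scores = np.array([gamma_fn(p) for p in points])
    keep = points[np.argsort(scores)[:n_critical]]       # retained critical points
    fresh = rng.uniform(lo, hi, size=(len(points) - n_critical, points.shape[1]))
    return np.vstack([keep, fresh])

# Toy gamma landscape with a low-gamma "hole" near (4, 4) that a fixed coarse
# grid could easily miss.
gamma = lambda p: float(np.linalg.norm(p - np.array([4.0, 4.0])))
pts = rng.uniform(-5.0, 5.0, size=(100, 2))
for _ in range(20):
    pts = refresh_checking_set(pts, gamma, n_critical=33, lo=-5.0, hi=5.0)
```

After a few refreshes the retained third of the set clusters around the low-γ region, mirroring the clustering of critical points seen in Figure 3-12, while the redrawn remainder keeps exploring the rest of X_0.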
The heuristic formulated in Chapter 2 for moving the search points around to find the critical points appears to be effective.

Figure 3-10: Level Sets of V for LGAC on System (3.3) (W = [1 0])

Figure 3-11: "Gamma" (γ) for LGAC on System (3.3) (W = [1 0])

Figure 3-12: Critical Points for LGAC Population Using System (3.3) (W = [1 0])

It is apparent from Figure 3-8, Figure 3-9, Figure 3-13, and Figure 3-14 that the time of convergence is much faster for the linear GAC with W = [1 0] than for the linear LARC, but the control effort is higher. Such a result is acceptable because no weight was placed on the control effort during the optimization. Another notable observation is how the trajectory of Figure 3-10 uniformly crosses into successively lower level sets, even though the control law is linear and does not possess the restrictive effect on the state trajectories that the nonlinear controller does. Figure 3-15 through Figure 3-19 are very similar to Figure 3-10 through Figure 3-14. This is expected, given that the optimization yielded very similar sets of control gains. The control effort is slightly smaller because the control effort carries a weight equal to that of the rate of convergence, rather than zero as in the previous case. It is notable how sensitive the rate of convergence and control effort are to the weights. Figure 3-20 through Figure 3-24 are the results of not considering the rate of convergence during the optimization (i.e., W = [0 1]). For W = [0 1], a very low control effort is achieved, but over much of X_0 we have γ < 0. Because the level sets are open, even where γ > 0 the trajectory can exit X_0 into the region of state space where the performance has not been checked, and no stability guarantees can be made. Although more research must be performed, it is reasonable to assume that some weight vector between W = [1 1] and W = [0 1] (setting w_1 between 0 and 1) might yield a stable controller with a smaller value of u_max than the linear LARC.
Such a nonlinear dependence of the performance measures on W makes tuning W a nontrivial task.

Figure 3-13: States of System (3.3) Using LGAC (W = [1 0])
Figure 3-14: Control Signal of LGAC on System (3.3) (W = [1 0])
Figure 3-15: Level Sets of V for LGAC on System (3.3) (W = [1 1])
Figure 3-16: "Gamma" (γ) for LGAC on System (3.3) (W = [1 1])
Figure 3-17: Critical Points for LGAC Population Using System (3.3) (W = [1 1])
Figure 3-18: States of System (3.3) Using LGAC (W = [1 1])
Figure 3-19: Control Signal of LGAC on System (3.3) (W = [1 1])
Figure 3-20: Level Sets of V for LGAC on System (3.3) (W = [0 1])
Figure 3-21: "Gamma" (γ) for LGAC on System (3.3) (W = [0 1])
Figure 3-22: Critical Points for LGAC Population Using System (3.3) (W = [0 1])
Figure 3-23: States of System (3.3) Using LGAC (W = [0 1])
Figure 3-24: Control Signal of LGAC on System (3.3) (W = [0 1])

Summarizing the comparison between the linear LARC and the linear GAC on the artificial system (3.3), we make the following observations. The linear GAC yields a strictly positive γ on X_0 for the two settings in which the rate of convergence affects the fitness function (W = [1 0] and W = [1 1]). The minimum rate of convergence is higher for the linear GAC for the weight settings W = [1 0] and W = [1 1], but the maximum control effort is also much higher. The linear GAC with W = [0 1] yields a much lower maximum control effort than the linear LARC, but the controller is unstable, which raises the question: with the correct setting of W, can both γ_min and u_max be improved on X_0? Although further research would be needed, viewing the trend of γ_min and u_max with the settings of W in Table 3-7, it is reasonable to assume such a weight setting exists.

3.2.2 GAC Case #2: Nonlinear GAC

We now consider nonlinear control of the artificial system (3.3). The genetic algorithm parameters for this case are given in Table 3-9.
System: Artificial (3.3); Controller type: nonlinear; Generations: 50; Population size: 50; Checking points: 100; Critical points: 33; Δ_E: 100; K_max: 1000; c: 0; X_0 range: x_1 ∈ [−25, 25], x_2 ∈ [−100, 100]
Table 3-9: GA Parameters to Optimize NGAC for System (3.3)

Table 3-10 lists the results of the optimization of the nonlinear GAC for the artificial system (3.3) using the parameters of Table 3-9. As in the case of the linear GAC, the GA found nondiagonal solutions for Q in all three cases.

Case 1, W = [1 0]: P = [292.952  239.153; 239.153  195.615], K = [597.88  489.04], Q = [3.5160×10⁵  2.8683×10⁵; 2.8683×10⁵  2.3398×10⁵]
Case 2, W = [1 1]: P = [1664.4  348.60; 348.60  77.100], K = [871.55  192.71], Q = [7.2631×10⁵  1.3836×10⁵; 1.3836×10⁵  0.2661×10⁵]
Case 3, W = [0 1]: P = [3.9109  3.5506; 3.5506  3.5455], K = [8.8766  8.8638], Q = [0.5750  0.5136; 0.5136  0.4625]
Table 3-10: Optimized Parameters of NGAC for System (3.3)

Table 3-11 displays the randomly sampled performance of the nonlinear GAC on the artificial system (3.3). Here we see that the rates of convergence are positive even for the case in which the rate of convergence is not a factor in the fitness function (W = [0 1]). The control effort is, however, much higher for the nonlinear GAC than for the linear GAC, similar to the comparison between the LARC and the nonlinear controller using P from the LARC optimization. A much better γ_min is achieved along with a much better u_max using the nonlinear GAC, but the nonlinear GAC has a higher average control effort than the nonlinear LARC. If such a feature is undesirable to the control engineer, the GA fitness function may easily be augmented with a term that favors low average control effort, although more points in X_0^d would be required to get a realistic estimate of the average control effort.
W = [1 0]: γ_min = 12.685, μ_γ = 1446.3, σ_γ = 679.85, u_max = 68306, μ_u = 25642, σ_u = 15525
W = [1 1]: γ_min = 7.8767, μ_γ = 476.99, σ_γ = 251.86, u_max = 51756, μ_u = 14837, σ_u = 10615
W = [0 1]: γ_min = 0.0384, μ_γ = 53.520, σ_γ = 57.022, u_max = 62062, μ_u = 1394.0, σ_u = 2103.7
Table 3-11: Performance of NGAC on System (3.3)

Table 3-12 presents the randomly sampled performance of the nonlinear GAC on the linearized artificial system (3.4). In all three cases, the local performance differs substantially from that across the entire set X_0.

W = [1 0]: γ_min = 2.3226, μ_γ = 2394.2, σ_γ = 169.28, u_max = 6362.4, μ_u = 25038, σ_u = 15347
W = [1 1]: γ_min = 45.683, μ_γ = 786.90, σ_γ = 205.97, u_max = 40986, μ_u = 13668, σ_u = 9612.2
W = [0 1]: γ_min = 0.0287, μ_γ = 20.980, σ_γ = 3.6402, u_max = 1102.7, μ_u = 452.00, σ_u = 269.85
Table 3-12: Performance of NGAC on Linearized System (3.4)

Figure 3-25, Figure 3-26, and Figure 3-27 contain plots of the CLF with the state trajectory for the initial condition (10, 20), the rate of convergence γ, and the locations of the estimated critical points for all controllers in the GA population. The GA yielded open level sets on X_0. The sampled performance yielded γ_min = 2.3226; however, the true value is clearly negative, given the unstable trajectory. This is a problem that occurs when the ratio λ_min(P)/λ_max(P) is small (e.g., λ_min(P)/λ_max(P) = 4.68×10⁻⁴ for the W = [1 0] case). The level sets are nearly open, and γ is actually slightly negative at some points along the line the trajectory follows to exit X_0 in Figure 3-25. Trajectories originating in this region are pulled into the crevice of negative γ values and follow it out of X_0. The points along this line (the "escape manifold") are hard to find with a small checking set, causing the GA to use a bad estimate of the true value of γ_min when evaluating the controllers. One solution is simply to increase the number of checking points in X_0^d; the next GA optimization run ("Case #3") is based on this approach. The ultimate solution, recommended for future research, is to guarantee that X_0 contains only closed level sets.
That is, guarantee that V(x) = V(y) for all x, y ∈ ∂X_0.

Figure 3-25: Level Sets of V for NGAC on System (3.3) (W = [1 0])
Figure 3-26: "Gamma" (γ) for NGAC on System (3.3) (W = [1 0])
Figure 3-27: Critical Points for NGAC Population Using System (3.3) (W = [1 0])
Figure 3-28: States of System (3.3) Using NGAC (W = [1 0])
Figure 3-29: Control Signal of NGAC on System (3.3) (W = [1 0])

The case with W = [1 1] (Figure 3-30 through Figure 3-34) yields a stable trajectory. In fact, the nonlinear GAC with W = [1 1] yields a trajectory that converges about twice as fast, with about half the maximum control effort, compared with the nonlinear LARC. The design objective is therefore achieved despite the problem of not using a large enough checking set X_0^d.

Figure 3-30: Level Sets of V for NGAC on System (3.3) (W = [1 1])
Figure 3-31: "Gamma" (γ) for NGAC on System (3.3) (W = [1 1])
Figure 3-32: Critical Points for NGAC Population Using System (3.3) (W = [1 1])
Figure 3-33: States of System (3.3) Using NGAC (W = [1 1])
Figure 3-34: Control Signal of NGAC on System (3.3) (W = [1 1])

Weight case 3 suffers from the same problem as weight case 1 (Figure 3-35 through Figure 3-39). However, using more search points or guaranteeing V(x) = V(y) for all x, y ∈ ∂X_0 would not necessarily guarantee a stable controller, since γ_min has zero weight in the fitness function. Case 3 is used to show the GA's ability to minimize the maximum control effort for the given CLF and controller topologies.

Figure 3-35: Level Sets of V for NGAC on System (3.3) (W = [0 1])
Figure 3-36: "Gamma" (γ) for NGAC on System (3.3) (W = [0 1])
Figure 3-37: Critical Points for NGAC Population Using System (3.3) (W = [0 1])
Figure 3-38: States of System (3.3) Using NGAC (W = [0 1])
Figure 3-39: Control Signal of NGAC on System (3.3) (W = [0 1])

3.2.3 GAC Case #3: Denser Checking Sets

The genetic algorithm parameters for Case #3 are given in Table 3-13.
We use the same GA parameters as in Case #2 with nonlinear control, but the number of checking points in X_0^d has been doubled. The intention is to catch the numerically ill-conditioned "crevice" in the γ plots of Case #2 and to tune P such that the ratio λ_min(P)/λ_max(P) is not too small, yet is not restricted by some predefined lower threshold.

System: Artificial (3.3); Controller type: nonlinear; Generations: 20; Population size: 50; Checking points: 200; Critical points: 100; Δ_E: 100; K_max: 1000; c: 0; X_0 range: x_1 ∈ [−25, 25], x_2 ∈ [−100, 100]
Table 3-13: GA Parameters to Optimize NGAC for System (3.3) (High-Density X_0^d)

Table 3-14 lists the results of the optimization of the nonlinear GAC for the artificial system (3.3) using the parameters of Table 3-13. Comparing Table 3-10 to Table 3-14, the doubling of the size of X_0^d appears to have substantially affected the GA's output for Weight Case 1.

Case 1, W = [1 0]: P = [4976.2  339.00; 339.00  36.100], K = [847.46  90.348], Q = [5.8939×10⁵  19600; 19600  1060]
Case 2, W = [1 1]: P = [4287.3  376.90; 376.90  45.800], K = [942.37  114.517], Q = [8.0232×10⁵  60900; 60900  5480]
Case 3, W = [0 1]: P = [3.8772  3.5242; 3.5242  3.5240], K = [8.8106  8.8100], Q = [0.0829  0.0751; 0.0751  0.0688]
Table 3-14: Optimized Parameters of NGAC for System (3.3) (High-Density X_0^d)

Table 3-15 displays the randomly sampled performance of the nonlinear GAC on the artificial system (3.3) using double the number of elements in X_0^d. Table 3-16 displays the randomly sampled performance of the nonlinear GAC on the linearized artificial system (3.4) using double the number of elements in X_0^d. The change in the performance of the nonlinear GAC for W = [1 0] between Case #2 and Case #3 is understandable given the substantial change in P and K. The change in γ_min of the nonlinear GAC for W = [0 1] between Case #2 and Case #3 was not expected, because P and K are very similar between the two cases.
A likely cause of the discrepancy is the small ratio λ_min(P)/λ_max(P) = 0.0238, where the true value of γ_min lies on a thin "escape manifold" as discussed in Case #2.

Weight Case   Weights   γ_min     μ_γ      σ_γ     u_max   μ_u     σ_u
1             W=[1 0]   32.628    174.013  62.836  87234   14322   12010
2             W=[1 1]   33.280    234.056  101.40  73692   14871   11249
3             W=[0 1]   −0.3079   53.8784  57.666  57181   1399.1  2042.3

Table 3-15: Performance of NGAC on System (3.3) (High Density X_0^d)

Weight Case   Weights   γ_min     μ_γ     σ_γ     u_max   μ_u     σ_u
1             W=[1 0]   151.578   254.15  31.559  30036   11165   7067.5
2             W=[1 1]   128.115   361.94  75.626  34929   12711   8190.3
3             W=[0 1]   0.00120   20.715  3.6789  1090.8  447.79  268.42

Table 3-16: Performance of NGAC on Linearized System (3.4) (High Density X_0^d)

For weight cases 1 and 2, Figure 3-40 and Figure 3-45 display closed V contours, showing stable trajectories and well-conditioned γ's (Figure 3-41 and Figure 3-46). Figure 3-51 still shows the negative γ seen in Case #2; this is expected because the GA search ignores γ altogether. Hence, it is advisable to put some small weight on γ when searching for controllers that yield low u_max. Comparing the state and control response of the nonlinear LARC in Figure 3-8 and Figure 3-9 to those of the nonlinear GAC with weight settings W=[1 0] and W=[1 1] in Figure 3-43, Figure 3-44, Figure 3-48, and Figure 3-49, we see that the nonlinear GAC converges about 4 times faster with about 20% less maximum control effort.
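The λ_min(P)/λ_max(P) = 0.0238 ratio quoted above can be checked in closed form: for a symmetric 2x2 matrix the eigenvalues are available without a numerical solver. The sketch below uses the weight-case-3 P from Table 3-14; the function names are ours.

```python
import math

def eig_sym2(p):
    # Closed-form eigenvalues of a symmetric 2x2 matrix [[a, b], [b, c]].
    a, b, c = p[0][0], p[0][1], p[1][1]
    mean = 0.5 * (a + c)
    rad = math.sqrt((0.5 * (a - c)) ** 2 + b * b)
    return mean - rad, mean + rad          # (lambda_min, lambda_max)

def level_set_ratio(p):
    # Conditioning of the level sets of V = x'Px: small ratio means long,
    # thin ellipses, which is where the "escape manifold" can hide.
    lo, hi = eig_sym2(p)
    return lo / hi

# P from weight case 3 of Table 3-14: nearly singular.
P3 = [[3.8772, 3.5242], [3.5242, 3.5240]]
print(level_set_ratio(P3))   # ~0.0238, the ratio quoted in the text
```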
Figure 3-40: Level Sets of V for NGAC on System (3.3) (W=[1 0], High Density X_0^d)
Figure 3-41: "Gamma" (γ) for NGAC on System (3.3) (W=[1 0], High Density X_0^d)
Figure 3-42: Critical Points for NGAC Using System (3.3) (W=[1 0], High Density X_0^d)
Figure 3-43: States of System (3.3) Using NGAC (W=[1 0], High Density X_0^d)
Figure 3-44: Control Signal of NGAC on System (3.3) (W=[1 0], High Density X_0^d)
Figure 3-45: Level Sets of V for NGAC on System (3.3) (W=[1 1], High Density X_0^d)
Figure 3-46: "Gamma" (γ) for NGAC on System (3.3) (W=[1 1], High Density X_0^d)
Figure 3-47: Critical Points for NGAC Using System (3.3) (W=[1 1], High Density X_0^d)
Figure 3-48: States of Artificial System (3.3) Using NGAC (W=[1 1], High Density X_0^d)
Figure 3-49: Control Signal of NGAC on System (3.3) (W=[1 1], High Density X_0^d)
Figure 3-50: Level Sets of V for NGAC on System (3.3) (W=[0 1], High Density X_0^d)
Figure 3-51: "Gamma" (γ) for NGAC on System (3.3) (W=[0 1], High Density X_0^d)
Figure 3-52: Critical Points for NGAC Using System (3.3) (W=[0 1], High Density X_0^d)
Figure 3-53: States of System (3.3) Using NGAC (W=[0 1], High Density X_0^d)
Figure 3-54: Control Signal of NGAC on System (3.3) (W=[0 1], High Density X_0^d)

In conclusion, we have seen in this example that while the LARC, by design, has a larger region of stability, the GAC can yield better performance in terms of minimum rate of convergence and maximum control effort. The GAC also offers the ability to select the region of state space over which to perform the optimization. To answer Question 3.1: checking the performance of the nonlinear system directly on X_0^d using the CLF V = x^T P x does offer a clear advantage over the robust-linear-control-theory-based procedure of Ngamsom's LARC.
It is important, however, to set both elements of W to a positive number and to use a sufficient number of elements in X_0^d (found via experimentation) to yield closed level sets for the CLF (λ_min(P)/λ_max(P) >> 0).

3.3 Example 2: Double Inverted Pendulum System

The system in Figure 3-55 was taken originally from Misawa (1995), where it is described in detail.

Figure 3-55: Double Inverted Pendulum System (Misawa et al., 1995)

The equations of motion are

    ẋ = f(x) + g(x)u    (3.5)

where x ∈ ℜ^4, f, g : ℜ^4 → ℜ^4, and u ∈ ℜ. With x_1, x_2 the joint angles and x_3, x_4 their rates, the kinematic components are f_1 = x_3 and f_2 = x_4 with g_1 = g_2 = 0, while f_3, f_4, g_3, and g_4 are lengthy rational expressions in sin x_1, sin x_2, cos(x_1 − x_2), and the angular rates, each a ratio of trigonometric terms with a denominator quadratic in cos(x_1 − x_2); the full expressions are given in Misawa (1995).

The linearized dynamics about the upright position are ẋ = Ax + Bu, with

    A = [  0         0         1        0
           0         0         0        1
          43.0260   −9.6925   −0.2541   0.1202
           0        13.3356    0.4787  −0.3593 ]    (3.6)

    B = [ 0   0   113.4536   −101.2405 ]^T

3.3.1 LARC Results

Two ranges of the checking set X_0^d were used for all controllers in this example. The small range, x_i ∈ [−0.1, 0.1], i = 1, 2, 3, 4, is used to capture the slightly nonlinear behavior local to the origin. The large range, x_i ∈ [−1, 1], i = 1, 2, 3, 4, is used to capture the highly nonlinear behavior away from the origin. Table 3-17 displays the LARC results for the double inverted pendulum system as reported in Ngamsom (2001).
P = [0.3882 0.9764 0.1241 0.1412; 0.9764 9.2923 1.0084 1.2328; 0.1241 1.0084 0.1211 0.1429; 0.1412 1.2328 0.1429 0.1759]
K = [0.1036 5.2008 0.3618 0.8021]
Q = [2 0 0 0; 0 2 0 0; 0 0 2 0; 0 0 0 2]

Table 3-17: LLARC Parameters for System (3.5) (Ngamsom, 2001)

Table 3-18 lists the randomly sampled performance of the linear LARC on the double inverted pendulum system (3.5). For the small range, γ_min is negative (although Ngamsom reports the system to be stable). It turns out that the system is unstable with respect to the particular quadratic CLF dictated by P in Table 3-17, though not necessarily quadratically unstable. For the large range, Ngamsom (2001) reports the system to be unstable; the highly negative γ_min found here supports that report.

X_0 Range                          γ_min     μ_γ     σ_γ     u_max   μ_u     σ_u
x_i ∈ [−0.1, 0.1], i = 1,2,3,4     −2.6472   0.0127  1.5402  0.6262  0.2628  0.1543
x_i ∈ [−1, 1], i = 1,2,3,4         −166.31   30.482  36.980  6.4308  2.6432  1.5451

Table 3-18: Performance of LLARC on System (3.5)

Table 3-19 shows the randomly sampled performance of the nonlinear LARC on the double inverted pendulum system (3.5). Because the linear LARC performs poorly over the large range, we may surmise that the system nonlinearities force the nonlinear controller to use high control effort to keep the state trajectory within level sets whose "angle" and "skewing" (if one can visualize in 4D!) are optimized for the locally linear behavior of the system.

X_0 Range                          γ_min    μ_γ     σ_γ     u_max   μ_u     σ_u
x_i ∈ [−0.1, 0.1], i = 1,2,3,4     0.4378   10.837  4.3018  1.4729  0.5487  0.3318
x_i ∈ [−1, 1], i = 1,2,3,4         0.0914   11.313  4.5151  17982   12741   174.81

Table 3-19: Performance of NLARC on System (3.5)

Table 3-20 lists the randomly sampled performance of the linear LARC on the linearized double inverted pendulum system (3.6).
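The distinction drawn above between instability of the system and instability with respect to a particular CLF can be made concrete. In the sketch below (illustrative 2x2 matrices, not the pendulum's), A is Hurwitz, so every trajectory of ẋ = Ax decays, yet the chosen positive definite P is not a Lyapunov matrix for A: V = x^T P x grows along some directions, so a sampled γ_min comes out negative even though the dynamics are stable.

```python
# A is Hurwitz (eigenvalues -1 and -2), so x' = Ax is asymptotically stable.
# P is positive definite but, by assumption, a poor CLF choice for A.
A = [[0.0, 1.0], [-2.0, -3.0]]
P = [[1.0, 0.0], [0.0, 100.0]]

def vdot(x):
    # d/dt of V(x) = x'Px along x' = Ax  (equals x'(A'P + PA)x).
    xdot = [A[0][0] * x[0] + A[0][1] * x[1],
            A[1][0] * x[0] + A[1][1] * x[1]]
    return 2.0 * sum(xdot[i] * P[i][j] * x[j] for i in range(2) for j in range(2))

print(vdot([1.0, -0.1]))   # positive: V grows here despite stable dynamics
print(vdot([1.0,  0.1]))   # negative: V decays here
```

"Unstable with respect to this CLF" is therefore a property of the pair (system, P), not of the system alone, which is why Ngamsom can report stability while the sampled γ_min under this P is negative.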
The rate of convergence is negative for the linear LARC on the nonlinear dynamics and relatively slow for the linear LARC on the linear dynamics, which suggests that significant nonlinearities will destabilize the system.

X_0 Range                          γ_min    μ_γ     σ_γ     u_max   μ_u     σ_u
x_i ∈ [−0.1, 0.1], i = 1,2,3,4     0.0208   0.4217  1.4648  0.6187  0.2634  0.1546
x_i ∈ [−1, 1], i = 1,2,3,4         0.0207   0.4017  1.2807  6.2708  2.6458  1.5476

Table 3-20: Performance of LLARC on Linearized System (3.6)

Figure 3-56 and Figure 3-57 contain the system signal responses to the initial condition (0.5, 0, 0, 0) for the linear LARC on the double inverted pendulum system (3.5). Figure 3-58 and Figure 3-59 contain the system signal responses to the same initial condition for the nonlinear LARC on system (3.5). The nonlinear LARC response is notable for the chattering of the control signal (Figure 3-59) and, in turn, of the state variables. Such is the nature of this nonlinear controller combined with a finite-time-stepping numerical simulation method. The infamous denominator term of the Sontag-like control law, x^T P g, gets very close to zero and rapidly switches sign during the first second of the simulation. With progressively smaller time steps the switching is less severe; however, the time step required becomes impractically small to produce a reasonably long simulation. The problem reflects a real implementation issue for the control law: discrete-time sampling. A digital implementation will face the same problem, most likely worse, given the limitations of sampling rates in real-time digital control systems.
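The sensitivity described above can be seen directly in a Sontag-type universal formula, u = −(a + √(a² + b⁴))/b with a = L_fV and b = L_gV. This is a minimal sketch, not the thesis implementation: wherever a > 0 while the x^T P g-type term b passes through zero, the commanded control grows like 2a/|b| and flips sign with b, which a finite-step simulation or a sampled-data controller turns into chattering.

```python
import math

def sontag_u(a, b):
    # Sontag-type formula with a = LfV and b = LgV; defined as 0 when b = 0.
    if b == 0.0:
        return 0.0
    return -(a + math.sqrt(a * a + b ** 4)) / b

# Hold a > 0 fixed and let b sweep through zero: |u| blows up like 2a/|b|
# and u changes sign with b -- the denominator problem described above.
a = 1.0
for b in (0.1, 0.01, -0.01, -0.1):
    print(b, sontag_u(a, b))
```

In continuous time the formula is well behaved wherever the CLF condition holds (b → 0 only where a < 0), but a fixed sampling period cannot track the near-singular region, hence the chattering seen in Figure 3-59.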
Figure 3-56: States of System (3.5) Using LLARC
Figure 3-57: Control Signal of System (3.5) Using LLARC
Figure 3-58: States of System (3.5) Using NLARC
Figure 3-59: Control Signal of System (3.5) Using NLARC

3.3.2 Linear GAC with Small Checking Set Range

Table 3-21 lists the genetic algorithm parameters used to optimize the linear GAC for the double inverted pendulum system (3.5) using a small X_0^d range. The checking set range is set small to minimize the effects of the system nonlinearities. The control gain upper limit, K_max, is set to 10.

System: Double Inverted Pendulum (3.5)
Controller Type: linear
Generations: 50
Population Size: 50
Checking Points: 200
Critical Points: 50
ΔE: 10
K_max: 10
c: 0
X_0 Range: x_i ∈ [−0.1, 0.1], i = 1, 2, 3, 4

Table 3-21: GA Parameters to Optimize LGAC for System (3.5) (Small X_0^d Range)

Table 3-22 shows the genetic algorithm results of the optimization of the linear GAC for the double inverted pendulum system (3.5) using a small X_0^d range. It is interesting that the optimal Q has a structure in which the first diagonal element is dominant and all other elements are almost negligible besides the terms involving the state x_1. It is also interesting that some terms are negative, which suggests that an optimal controller keeps the states coupled through those terms opposite in sign during most of the state trajectory. This type of solution is not obvious and would probably not have been selected by a human designer, yet it becomes a useful insight for the controller design.
Weight Case 1 (W=[1 0]):
P = [0.3010 0.6116 0.0781 0.0914; 0.6116 2.0038 0.2208 0.2984; 0.0781 0.2208 0.0258 0.0330; 0.0914 0.2984 0.0330 0.0445]
K = [0.3922 5.1569 0.4182 0.7570]
Q = [0.4513 0.0053 0.0021 0.0041; 0.0053 0.0001 0.0000 0.0000; 0.0021 0.0000 0.0001 0.0001; 0.0041 0.0000 0.0001 0.0002]

Weight Case 2 (W=[1 1]):
P = [0.1179 0.2750 0.0367 0.0418; 0.2750 1.2195 0.1267 0.1819; 0.0367 0.1267 0.0144 0.0190; 0.0418 0.1819 0.0190 0.0272]
K = [0.0676 4.0455 0.2904 0.5908]
Q = [0.0559 0.0019 0.0010 0.0004; 0.0019 0.0001 0.0000 0.0001; 0.0010 0.0000 0.0000 0.0000; 0.0004 0.0001 0.0000 0.0001]

Weight Case 3 (W=[0 1]):
P = [0.1300 0.2955 0.0392 0.0448; 0.2955 1.2698 0.1327 0.1895; 0.0392 0.1327 0.0151 0.0199; 0.0448 0.1895 0.0199 0.0283]
K = [0.0901 4.1270 0.2997 0.6029]
Q = [0.0765 0.0021 0.0007 0.0010; 0.0021 0.0001 0.0000 0.0000; 0.0007 0.0000 0.0000 0.0000; 0.0010 0.0000 0.0000 0.0000]

Table 3-22: Optimized Parameters of LGAC for System (3.5) (Small X_0^d Range)

Table 3-22 displays the randomly sampled performance of the linear GAC on the double inverted pendulum system (3.5) using a small X_0^d range. Unlike the linear LARC (Table 3-18), the linear GAC yields a positive minimum rate of convergence, with a much higher average rate of convergence and without a significant difference in control effort or gains across the sets of fitness function weights.

Weight Case   Weights   γ_min    μ_γ      σ_γ     u_max   μ_u     σ_u
1             W=[1 0]   0.2212   12.6432  6.5883  0.6455  0.2597  0.1526
2             W=[1 1]   0.0568   12.6826  6.5130  0.4851  0.2049  0.1192
3             W=[0 1]   0.0739   12.5958  6.4883  0.5028  0.2090  0.1229

Table 3-22: Performance of LGAC on System (3.5) (Small X_0^d Range)

Table 3-23 displays the randomly sampled performance of the linear GAC on the linearized double inverted pendulum system (3.6) using a small X_0^d range. The rate of convergence is fairly close to that of Table 3-22, which suggests that the nonlinearities do not play as significant a role locally as they do with the linear LARC, where the difference in γ_min between the nonlinear and linear dynamics is substantial.
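The local agreement between the nonlinear and linearized scores can be reproduced on a toy example. The dynamics, gain, and P below are illustrative assumptions (a single-pendulum-like system, not the thesis's double pendulum); the sketch only shows that one quadratic CLF scores nearly identically on a nonlinear system and its linearization near the origin, and diverges away from it.

```python
import math

# Toy closed loop: x1' = x2, x2' = sin(x1) - x2 + u, with u = -Kx.
# All values are assumptions for illustration.
K = [4.0, 2.0]
P = [[3.0, 1.0], [1.0, 1.0]]

def f_nonlinear(x):
    u = -(K[0] * x[0] + K[1] * x[1])
    return [x[1], math.sin(x[0]) - x[1] + u]

def f_linear(x):
    u = -(K[0] * x[0] + K[1] * x[1])
    return [x[1], x[0] - x[1] + u]       # sin(x1) ~ x1 about the origin

def clf_gamma(f, x):
    # gamma(x) = -Vdot/V for V = x'Px along the vector field f.
    xd = f(x)
    v = sum(x[i] * P[i][j] * x[j] for i in range(2) for j in range(2))
    vd = 2.0 * sum(xd[i] * P[i][j] * x[j] for i in range(2) for j in range(2))
    return -vd / v

near, far = [0.05, 0.05], [1.5, 1.5]
# Near the origin the two scores agree closely; far away they separate.
print(clf_gamma(f_nonlinear, near) - clf_gamma(f_linear, near))
print(clf_gamma(f_nonlinear, far) - clf_gamma(f_linear, far))
```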
Weight Case   Weights   γ_min    μ_γ      σ_γ     u_max   μ_u     σ_u
1             W=[1 0]   0.1113   13.1476  6.9741  0.6540  0.2605  0.1539
2             W=[1 1]   0.0799   13.0182  6.8206  0.4877  0.2048  0.1198
3             W=[0 1]   0.0168   13.0161  6.8388  0.5043  0.2068  0.1218

Table 3-23: Performance of LGAC on Linearized System (3.6) (Small X_0^d Range)

Figure 3-60 through Figure 3-65 contain the system signal responses to the initial condition (0.5, 0, 0, 0) for the linear GAC on the double inverted pendulum system (3.5) using a small X_0^d range. The weight setting W=[1 0] yields a much faster response, but at the cost of more control effort than the linear LARC. The weight setting W=[1 1] does not yield a noticeably faster response than the linear LARC, and its control effort is higher. The weight setting W=[0 1] yields a much faster response with only slightly more control effort than the linear LARC; the linear GAC for W=[0 1] would probably be selected over the linear LARC.

Figure 3-60: States of System (3.5) Using LGAC (W=[1 0], Small X_0^d Range)
Figure 3-61: Control Signal of System (3.5) Using LGAC (W=[1 0], Small X_0^d Range)
Figure 3-62: States of System (3.5) Using LGAC (W=[1 1], Small X_0^d Range)
Figure 3-63: Control Signal of System (3.5) Using LGAC (W=[1 1], Small X_0^d Range)
Figure 3-64: States of System (3.5) Using LGAC (W=[0 1], Small X_0^d Range)
Figure 3-65: Control Signal of System (3.5) Using LGAC (W=[0 1], Small X_0^d Range)

3.3.3 Linear GAC with Large Checking Set Range

Table 3-24 shows the genetic algorithm parameters used to optimize the linear GAC for the double inverted pendulum system (3.5) using a large X_0^d range. The checking set range is set high to demonstrate the effects of the system nonlinearities. The control gain upper limit, K_max, is set to 10.
System: Double Inverted Pendulum (3.5)
Controller Type: linear
Generations: 50
Population Size: 50
Checking Points: 200
Critical Points: 50
ΔE: 10
K_max: 10
c: 0
X_0 Range: x_i ∈ [−1, 1], i = 1, 2, 3, 4

Table 3-24: GA Parameters to Optimize LGAC for System (3.5) (Large X_0^d Range)

Table 3-25 displays the GA results of the optimization of the linear GAC for the double inverted pendulum system using a large X_0^d range. The results are somewhat similar to those of the linear GAC run with a small X_0^d range, indicating low sensitivity to the X_0^d range.

Weight Case 1 (W=[1 0]):
P = [0.1009 0.2370 0.0322 0.0363; 0.2370 1.1231 0.1154 0.1678; 0.0322 0.1154 0.0131 0.0174; 0.0363 0.1678 0.0174 0.0251]
K = [0.0237 3.8883 0.2727 0.5673]
Q = [0.0186 0.0019 0.0002 0.0008; 0.0019 0.0002 0.0000 0.0000; 0.0002 0.0000 0.0000 0.0001; 0.0008 0.0000 0.0001 0.0002]

Weight Case 2 (W=[1 1]):
P = [0.1724 0.3756 0.0496 0.0574; 0.3756 1.4711 0.1574 0.2202; 0.0496 0.1574 0.0182 0.0237; 0.0574 0.2202 0.0237 0.0330]
K = [0.1804 4.4422 0.3357 0.6500]
Q = [0.1694 0.0010 0.0005 0.0025; 0.0010 0.0000 0.0000 0.0000; 0.0005 0.0000 0.0000 0.0000; 0.0025 0.0000 0.0000 0.0001]

Weight Case 3 (W=[0 1]):
P = [0.1581 0.3498 0.0458 0.0528; 0.3498 1.4030 0.1484 0.2091; 0.0458 0.1484 0.0170 0.0223; 0.0528 0.2091 0.0223 0.0312]
K = [0.1486 4.3311 0.3230 0.6336]
Q = [0.1348 0.0014 0.0011 0.0017; 0.0014 0.0001 0.0000 0.0001; 0.0011 0.0000 0.0000 0.0000; 0.0017 0.0001 0.0000 0.0002]

Table 3-25: Optimized Parameters of LGAC for System (3.5) (Large X_0^d Range)

Table 3-26 lists the randomly sampled performance of the linear GAC on the double inverted pendulum system (3.5) using a large X_0^d range. Weight case 1 yielded a faster rate of convergence than the LARC, but a larger maximum control effort.
Weight Case   Weights   γ_min     μ_γ     σ_γ     u_max   μ_u     σ_u
1             W=[1 0]   0.1073    9.7520  3.4343  45061   11.004  332.63
2             W=[1 1]   −158.19   17.641  35.302  5.4339  2.2438  1.3166
3             W=[0 1]   −148.99   16.986  34.249  5.2611  2.1870  1.2875

Table 3-26: Performance of LGAC on System (3.5) (Large X_0^d Range)

Table 3-27 displays the randomly sampled performance of the linear GAC on the linearized double inverted pendulum system (3.6) using a large X_0^d range. Notice that γ_min is negative for W=[0 1]. This is because the minimum eigenvalue of P is slightly negative (λ_min(P) = −1.4331x10^−5) due to Q having an eigenvalue close to zero. The linear system is stable, with the closed-loop poles for the linearized dynamics all in the left half plane (closed-loop poles = {−7.2235 ± 2.0095i, −7.7300, −5.9369}); however, it is unstable with respect to the CLF. Though this is expected when the rate of convergence is not checked, the problem is solved by setting c > 0 in Table 3-24 so that the smallest eigenvalue of Q is no less than c.

Weight Case   Weights   γ_min     μ_γ     σ_γ     u_max   μ_u     σ_u
1             W=[1 0]   0.0604    13.024  6.7999  4.6734  1.9640  1.1540
2             W=[1 1]   0.0235    13.001  6.7290  5.3853  2.2410  1.3147
3             W=[0 1]   −0.7703   12.994  6.8051  5.2696  2.1968  1.2867

Table 3-27: Performance of LGAC on Linearized System (3.6) (Large X_0^d Range)

Figure 3-66 through Figure 3-71 contain the system signal responses to the initial condition (0.5, 0, 0, 0) for the linear GAC on the double inverted pendulum system (3.5) using a large X_0^d range. The controller gains of this set of controllers are similar, hence the dynamic responses are similar; none of them is an improvement over the linear LARC. It is suspected that the number of checking points used is insufficient for the large X_0^d range. The same number of points was used here as with the smaller X_0^d range, so the estimates of γ_min and u_max were better for the small range, making those controllers better. The large range is 10 times larger than the small range.
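How the required checking set grows with range and state dimension is simple arithmetic, sketched below (the function name is ours, not the thesis's): to keep the same point density when each of the n state ranges is scaled by a factor s, a uniformly spread checking set must grow by a factor of s^n.

```python
# Growth factor for a uniformly spread checking set when every one of `dim`
# state ranges is scaled by `scale`, holding point density constant.
def density_factor(scale, dim):
    return scale ** dim

# The situation here: a 4-state system whose range grows 10x along each axis.
print(density_factor(10, 4))   # 10000
```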
Given a 4th-order system, to duplicate the checking set density of the small range over the large range, the large-range checking set size would have to be increased by a factor of 10,000! We now see the major drawback of direct pointwise CLF performance checking. The random search for the critical points relaxes the checking set size requirements; however, the sufficient number of checking points still becomes prohibitively large for higher-dimensional systems. Future research should focus on reducing the rate at which the required checking set size scales with the system dimension.

Figure 3-66: States of System (3.5) Using LGAC (W=[1 0], Large X_0^d Range)
Figure 3-67: Control Signal of System (3.5) Using LGAC (W=[1 0], Large X_0^d Range)
Figure 3-68: States of System (3.5) Using LGAC (W=[1 1], Large X_0^d Range)
Figure 3-69: Control Signal of System (3.5) Using LGAC (W=[1 1], Large X_0^d Range)
Figure 3-70: States of System (3.5) Using LGAC (W=[0 1], Large X_0^d Range)
Figure 3-71: Control Signal of System (3.5) Using LGAC (W=[0 1], Large X_0^d Range)

3.3.4 Nonlinear GAC with Small Checking Set Range

Table 3-28 shows the genetic algorithm parameters used to optimize the nonlinear GAC for the double inverted pendulum system (3.5) using a small X_0^d range. The checking set range is set small to minimize the effects of the system nonlinearities. The control gain upper limit, K_max, is set to 10.

System: Double Inverted Pendulum (3.5)
Controller Type: nonlinear
Generations: 50
Population Size: 50
Checking Points: 200
Critical Points: 50
ΔE: 10
K_max: 10
c: 0
X_0 Range: x_i ∈ [−0.1, 0.1], i = 1, 2, 3, 4

Table 3-28: GA Parameters to Optimize NGAC for System (3.5) (Small X_0^d Range)

Table 3-29 displays the GA results of the optimization of the nonlinear GAC for the double inverted pendulum system (3.5) using a small X_0^d range. The results are similar to those of the linear GAC run.
Weight Case 1 (W=[1 0]):
P = [0.1536 0.3442 0.0453 0.0522; 0.3442 1.3934 0.1474 0.2079; 0.0453 0.1474 0.0169 0.0222; 0.0522 0.2079 0.0222 0.0311]
K = [0.1442 4.3183 0.3216 0.6321]
Q = [0.1301 0.0019 0.0011 0.0000; 0.0019 0.0000 0.0000 0.0000; 0.0011 0.0000 0.0001 0.0002; 0.0000 0.0000 0.0002 0.0006]

Weight Case 2 (W=[1 1]):
P = [0.1524 0.3407 0.0449 0.0518; 0.3407 1.3835 0.1465 0.2067; 0.0449 0.1465 0.0168 0.0220; 0.0518 0.2067 0.0220 0.0309]
K = [0.1420 4.3067 0.3202 0.6298]
Q = [0.1279 0.0021 0.0011 0.0007; 0.0021 0.0002 0.0000 0.0001; 0.0011 0.0000 0.0001 0.0001; 0.0007 0.0001 0.0001 0.0001]

Weight Case 3 (W=[0 1]):
P = [0.1665 0.3652 0.0480 0.0554; 0.3652 1.4418 0.1535 0.2154; 0.0480 0.1535 0.0177 0.0231; 0.0554 0.2154 0.0231 0.0322]
K = [0.1666 4.3942 0.3302 0.6428]
Q = [0.1541 0.0022 0.0005 0.0002; 0.0022 0.0001 0.0000 0.0000; 0.0005 0.0000 0.0000 0.0000; 0.0002 0.0000 0.0000 0.0000]

Table 3-29: Optimized Parameters of NGAC for System (3.5) (Small X_0^d Range)

Table 3-30 lists the randomly sampled performance of the nonlinear GAC on the double inverted pendulum system (3.5) using a small X_0^d range. The nonlinear LARC's minimum rate of convergence is about twice as fast as the nonlinear GAC's, but the nonlinear LARC's maximum control effort is over twice that of the nonlinear GAC.

Weight Case   Weights   γ_min    μ_γ      σ_γ     u_max   μ_u     σ_u
1             W=[1 0]   0.2230   12.9758  6.7768  0.5439  0.2201  0.1311
2             W=[1 1]   0.0416   12.9448  6.7075  0.5349  0.2216  0.1311
3             W=[0 1]   0.1084   12.9666  6.7272  0.5400  0.2256  0.1338

Table 3-30: Performance of NGAC on System (3.5) (Small X_0^d Range)

Table 3-31 tabulates the randomly sampled performance of the nonlinear GAC on the linearized double inverted pendulum system (3.6) using a small X_0^d range. As with the linear GAC, the linearized dynamics are close to the nonlinear dynamics locally, decreasing the effect of the system nonlinearities.
Weight Case   Weights   γ_min    μ_γ      σ_γ     u_max   μ_u     σ_u
1             W=[1 0]   0.2890   13.0663  6.9025  0.5251  0.2170  0.1291
2             W=[1 1]   0.1984   12.9550  6.7552  0.5317  0.2176  0.1283
3             W=[0 1]   0.0433   12.9104  6.8273  0.5330  0.2215  0.1303

Table 3-31: Performance of NGAC on Linearized System (3.6) (Small X_0^d Range)

Figure 3-72 through Figure 3-77 contain the system signal responses to the initial condition (0.5, 0, 0, 0) for the nonlinear GAC on the double inverted pendulum system (3.5) using a small X_0^d range. As with the nonlinear LARC, the nonlinear GAC for the first two weight settings yields a choppy control response that in turn makes the state response choppy. The weight setting W=[0 1] is ideal and is a major improvement over the nonlinear LARC.

Figure 3-72: States of System (3.5) Using NGAC (W=[1 0], Small X_0^d Range)
Figure 3-73: Control Signal of System (3.5) Using NGAC (W=[1 0], Small X_0^d Range)
Figure 3-74: States of System (3.5) Using NGAC (W=[1 1], Small X_0^d Range)
Figure 3-75: Control Signal of System (3.5) Using NGAC (W=[1 1], Small X_0^d Range)
Figure 3-76: States of System (3.5) Using NGAC (W=[0 1], Small X_0^d Range)
Figure 3-77: Control Signal of System (3.5) Using NGAC (W=[0 1], Small X_0^d Range)

3.3.5 Nonlinear GAC with Large Checking Set Range

Table 3-32 shows the genetic algorithm parameters used to optimize the nonlinear GAC for the double inverted pendulum system (3.5) using a large X_0^d range. The checking set range is set high to demonstrate the effects of the system nonlinearities. The control gain upper limit, K_max, is set to 10.
System: Double Inverted Pendulum (3.5)
Controller Type: nonlinear
Generations: 50
Population Size: 50
Checking Points: 200
Critical Points: 50
ΔE: 10
K_max: 10
c: 0
X_0 Range: x_i ∈ [−1, 1], i = 1, 2, 3, 4

Table 3-32: GA Parameters to Optimize NGAC for System (3.5) (Large X_0^d Range)

Table 3-33 displays the GA results of the optimization of the nonlinear GAC for the double inverted pendulum system (3.5) using a large X_0^d range.

Weight Case 1 (W=[1 0]):
P = [0.1009 0.2370 0.0322 0.0363; 0.2370 1.1231 0.1154 0.1678; 0.0322 0.1154 0.0131 0.0174; 0.0363 0.1678 0.0174 0.0251]
K = [0.0237 3.8883 0.2727 0.5673]
Q = [0.0186 0.0019 0.0002 0.0008; 0.0019 0.0002 0.0000 0.0000; 0.0002 0.0000 0.0000 0.0001; 0.0008 0.0000 0.0001 0.0002]

Weight Case 2 (W=[1 1]):
P = [0.2060 0.4427 0.0577 0.0670; 0.4427 1.6272 0.1759 0.2432; 0.0577 0.1759 0.0204 0.0264; 0.0670 0.2432 0.0264 0.0364]
K = [0.2423 4.6634 0.3609 0.6830]
Q = [0.2426 0.0081 0.0009 0.0009; 0.0081 0.0003 0.0000 0.0000; 0.0009 0.0000 0.0000 0.0000; 0.0009 0.0000 0.0000 0.0000]

Weight Case 3 (W=[0 1]):
P = [0.1272 0.2925 0.0390 0.0445; 0.2925 1.3032 0.1371 0.1950; 0.0390 0.1371 0.0158 0.0207; 0.0445 0.1950 0.0207 0.0292]
K = [0.0807 4.1851 0.3027 0.6116]
Q = [0.0677 0.0005 0.0006 0.0001; 0.0005 0.0001 0.0002 0.0000; 0.0006 0.0002 0.0015 0.0003; 0.0001 0.0000 0.0003 0.0001]

Table 3-33: Optimized Parameters of NGAC for System (3.5) (Large X_0^d Range)

Table 3-34 lists the randomly sampled performance of the nonlinear GAC on the double inverted pendulum system (3.5) using a large X_0^d range.

Weight Case   Weights   γ_min    μ_γ     σ_γ     u_max   μ_u     σ_u
1             W=[1 0]   0.1073   9.7520  3.4343  45061   11.004  332.63
2             W=[1 1]   0.1202   10.819  3.6796  9902.3  9.0307  107.42
3             W=[0 1]   0.0798   9.9874  3.2041  9128.9  8.2549  92.792

Table 3-34: Performance of NGAC on System (3.5) (Large X_0^d Range)

Table 3-35 tabulates the randomly sampled performance of the nonlinear GAC on the linearized double inverted pendulum system (3.6) using a large X_0^d range.
In theory, the nonlinear GAC should yield a positive γ_min for the linearized double inverted pendulum system, since on linear dynamics the nonlinear GAC becomes the LQR for the Q in Table 3-33. However, weight case 3 yielded a P with an eigenvalue of only 1.313x10^−5, because Q has eigenvalues very close to zero, and this caused γ_min to take a small negative value. Setting c > 0 forces the eigenvalues of Q to be no less than c and alleviates the problem.

Weight Case   Weights   γ_min     μ_γ      σ_γ     u_max   μ_u     σ_u
1             W=[1 0]   0.1505    12.9276  6.6813  5.0258  2.1126  1.2399
2             W=[1 1]   0.0933    13.1822  6.9588  5.9003  2.3706  1.3879
3             W=[0 1]   −0.0045   12.9970  6.8068  4.6618  1.9632  1.1494

Table 3-35: Performance of NGAC on Linearized System (3.6) (Large X_0^d Range)

Figure 3-78 through Figure 3-83 contain the system signal responses to the initial condition (0.5, 0, 0, 0) for the nonlinear GAC on the double inverted pendulum system (3.5) using a large X_0^d range. All three sets of weights yield controller chattering. The cause is attri
Title: Using Genetic Algorithms to Optimize Control Lyapunov Functions
Date: 2008-07-01
Author: Hargrave, Brian K.
Department: Mechanical Engineering
Full Text Type: Open Access
Note: Thesis
Rights: © Oklahoma Agricultural and Mechanical Board of Regents
Transcript  USING GENETIC ALGORITHMS TO OPTIMIZE CONTROL LYAPUNOV FUNCTIONS By BRIAN K. HARGRAVE Bachelor of Science in Mechanical Engineering Oklahoma State University Stillwater, OK 1999 Submitted to the Faculty of the Graduate College of the Oklahoma State University in partial fulfillment of the requirements for the Degree of MASTER OF SCIENCE July, 2008 ii USING GENETIC ALGORITHMS TO OPTIMIZE CONTROL LYAPUNOV FUNCTIONS Thesis Approved: Dr. L. L. Hoberock Thesis Adviser Dr. P. R. Pagilla Dr. G. E. Young Dr. A. Gordon Emslie Dean of the Graduate College iii Table of Contents Chapter Page 1 INTRODUCTION................................................................................................................................1 1.1 MOTIVATION................................................................................................................................1 1.2 PROBLEM STATEMENT AND BACKGROUND THEORY.....................................................................7 1.3 THESIS OUTLINE .........................................................................................................................15 2 A GENETIC ALGORITHM FOR CLF OPTIMIZATION.................................................................17 2.1 GENETIC ALGORITHM MOTIVATION ...........................................................................................17 2.2 TAILORING TO THE SPECIFIC PROBLEM.......................................................................................18 2.3 GENETIC ALGORITHM DESCRIPTION ...........................................................................................21 2.4 STOCHASTIC ESTIMATION OF min γ .............................................................................................27 3 FULL STATEFEEDBACK CLF CONTROL....................................................................................32 3.1 SELECTION OF BENCHMARK EXAMPLES 
.....................................................................................32 3.2 EXAMPLE 1: ARTIFICIAL SYSTEM..............................................................................................35 3.3 EXAMPLE 2: DOUBLE INVERTED PENDULUM SYSTEM...............................................................78 3.4 EXAMPLE 3: CARTANDPOLE SYSTEM ...................................................................................106 3.5 DISCUSSION OF RESULTS ..........................................................................................................132 4 FUTURE DIRECTIONS AND CONCLUSIONS.............................................................................135 4.1 GENERALIZED CONTROL LYAPUNOV FUNCTIONS.....................................................................135 4.2 IMPROVED THE CRITICAL POINT COMPUTATION.......................................................................137 4.3 OUTPUTFEEDBACK, ADAPTIVE, AND ROBUST CONTROL .........................................................140 4.4 DISCRETETIME CONTROL........................................................................................................140 4.5 PUTTING IT ALL TOGETHER FOR PRACTICAL CONTROL DESIGN ..............................................141 4.6 CONCLUSION............................................................................................................................153 REFERENCES...........................................................................................................................................155 APPENDIX: MATLAB CODE LISTING AND HOW TO USE IT ..........................................................157 iv List of Tables Table Page TABLE 31: LLARC PARAMETERS FOR SYSTEM (3.3) (NGAMSOM, 2001) .....................................................36 TABLE 32: PERFORMANCE OF LLARC ON SYSTEM (3.3)..............................................................................36 TABLE 33: PERFORMANCE OF NLARC ON 
system (3.3)
Table 3.4: Performance of LLARC on linearized system (3.4)
Table 3.5: GA parameters to optimize LGAC for system (3.3)
Table 3.6: Optimized parameters of LGAC for system (3.3)
Table 3.7: Performance of LGAC on system (3.3)
Table 3.8: Performance of LGAC on linearized system (3.4)
Table 3.9: GA parameters to optimize NGAC for system (3.3)
Table 3.10: Optimized parameters of NGAC for system (3.3)
Table 3.11: Performance of NGAC on system (3.3)
Table 3.12: Performance of NGAC on linearized system (3.4)
Table 3.13: GA parameters to optimize NGAC for system (3.3) (high-density X_d^0)
Table 3.14: Optimized parameters of NGAC for system (3.3) (high-density X_d^0)
Table 3.15: Performance of NGAC on system (3.3) (high-density X_d^0)
Table 3.16: Performance of NGAC on linearized system (3.4) (high-density X_d^0)
Table 3.17: LLARC parameters for system (3.5) (Ngamsom, 2001)
Table 3.18: Performance of LLARC on system (3.5)
Table 3.19: Performance of NLARC on system (3.5)
Table 3.20: Performance of LLARC on linearized system (3.6)
Table 3.21: GA parameters to optimize LGAC for system (3.5) (small X_d^0 range)
Table 3.22: Optimized parameters of LGAC for system (3.5) (small X_d^0 range)
Table 3.23: Performance of LGAC on linearized system (3.6) (small X_d^0 range)
Table 3.24: GA parameters to optimize LGAC for system (3.5) (large X_d^0 range)
Table 3.25: Optimized parameters of LGAC for system (3.5) (large X_d^0 range)
Table 3.26: Performance of LGAC on system (3.5) (large X_d^0 range)
Table 3.27: Performance of LGAC on linearized system (3.6) (large X_d^0 range)
Table 3.28: GA parameters to optimize NGAC for system (3.5) (small X_d^0 range)
Table 3.29: Optimized parameters of NGAC for system (3.5) (small X_d^0 range)
Table 3.30: Performance of NGAC on system (3.5) (small X_d^0 range)
Table 3.31: Performance of NGAC on linearized system (3.6) (small X_d^0 range)
Table 3.32: GA parameters to optimize NGAC for system (3.5) (large X_d^0 range)
Table 3.33: Optimized parameters of NGAC for system (3.5) (large X_d^0 range)
Table 3.34: Performance of NGAC on system (3.5) (large X_d^0 range)
Table 3.35: Performance of NGAC on system (3.6) (large X_d^0 range)
Table 3.36: LLARC parameters for system (3.7) (Ngamsom, 2001)
Table 3.37: Performance of LLARC on system (3.7)
Table 3.38: Performance of NLARC on system (3.7)
Table 3.39: Performance of LLARC on linearized system (3.8)
Table 3.40: GA parameters to optimize LGAC for system (3.7) (small X_d^0 range)
Table 3.41: Optimized parameters of LGAC for system (3.7) (small X_d^0 range)
Table 3.42: Performance of LGAC on system (3.7) (small X_d^0 range)
Table 3.43: Performance of LGAC on linearized system (3.8) (small X_d^0 range)
Table 3.44: GA parameters to optimize LGAC for system (3.7) (large X_d^0 range)
Table 3.45: Optimized parameters of LGAC for system (3.7) (large X_d^0 range)
Table 3.46: Performance of LGAC on system (3.7) (large X_d^0 range)
Table 3.47: Performance of LGAC on linearized system (3.8) (large X_d^0 range)
Table 3.48: GA parameters to optimize NGAC for system (3.7) (small X_d^0 range)
Table 3.49: Optimized parameters of NGAC for system (3.7) (small X_d^0 range)
Table 3.50: Performance of NGAC on system (3.7) (small X_d^0 range)
Table 3.51: Performance of NGAC on linearized system (3.8) (small X_d^0 range)
Table 3.52: GA parameters to optimize NGAC for system (3.7) (large X_d^0 range)
Table 3.53: Optimized parameters of NGAC for system (3.7) (large X_d^0 range)
Table 3.54: Performance of NGAC on system (3.7) (large X_d^0 range)
Table 3.55: Performance of NGAC on linearized system (3.8) (large X_d^0 range)

List of Figures

Figure 2.1: Flow of fitness computation
Figure 2.2: GA optimization algorithm
Figure 3.1: Level sets of V for LLARC on system (3.3)
Figure 3.2: "Gamma" (γ) for LLARC on system (3.3)
Figure 3.3: Stability table from Ngamsom (2001)
Figure 3.4: States of system (3.3) using LLARC
Figure 3.5: Control signal of LLARC on system (3.3)
Figure 3.6: Level sets of V for NLARC on system (3.3)
Figure 3.7: "Gamma" (γ) for NLARC on system (3.3)
Figure 3.8: States of system (3.3) using NLARC
Figure 3.9: Control signal of NLARC on system (3.3)
Figure 3.10: Level sets of V for LGAC on system (3.3) (W=[1 0])
Figure 3.11: "Gamma" (γ) for LGAC on system (3.3) (W=[1 0])
Figure 3.12: Critical points for LGAC population using system (3.3) (W=[1 0])
Figure 3.13: States of system (3.3) using LGAC (W=[1 0])
Figure 3.14: Control signal of LGAC on system (3.3) (W=[1 0])
Figure 3.15: Level sets of V for LGAC on system (3.3) (W=[1 1])
Figure 3.16: "Gamma" (γ) for LGAC on system (3.3) (W=[1 1])
Figure 3.17: Critical points for LGAC population using system (3.3) (W=[1 1])
Figure 3.18: States of system (3.3) using LGAC (W=[1 1])
Figure 3.19: Control signal profile of LGAC on system (3.3) (W=[1 1])
Figure 3.20: Level sets of V for LGAC on system (3.3) (W=[0 1])
Figure 3.21: "Gamma" (γ) for LGAC on system (3.3) (W=[0 1])
Figure 3.22: Critical points for LGAC population using system (3.3) (W=[0 1])
Figure 3.23: States of system (3.3) using LGAC (W=[0 1])
Figure 3.24: Control signal of LGAC on system (3.3) (W=[0 1])
Figure 3.25: Level sets of V for NGAC on system (3.3) (W=[1 0])
Figure 3.26: "Gamma" (γ) for NGAC on system (3.3) (W=[1 0])
Figure 3.27: Critical points for NGAC population using system (3.3) (W=[1 0])
Figure 3.28: States of system (3.3) using NGAC (W=[1 0])
Figure 3.29: Control signal of NGAC on system (3.3) (W=[1 0])
Figure 3.30: Level sets of V for NGAC on system (3.3) (W=[1 1])
Figure 3.31: "Gamma" (γ) for NGAC on system (3.3) using W=[1 1]
Figure 3.32: Critical points for NGAC population using system (3.3) (W=[1 1])
Figure 3.33: States of system (3.3) using NGAC (W=[1 1])
Figure 3.34: Control signal of NGAC on system (3.3) (W=[1 1])
Figure 3.35: Level sets of V for NGAC on system (3.3) (W=[0 1])
Figure 3.36: "Gamma" (γ) for NGAC on artificial system (3.3) (W=[0 1])
Figure 3.37: Critical points for NGAC population using system (3.3) (W=[0 1])
Figure 3.38: States of system (3.3) using NGAC (W=[0 1])
Figure 3.39: Control signal of NGAC on system (3.3) (W=[0 1])
Figure 3.40: Level sets of V for NGAC on system (3.3) (W=[1 0], high-density X_d^0)
Figure 3.41: "Gamma" (γ) for NGAC on system (3.3) (W=[1 0], high-density X_d^0)
Figure 3.42: Critical points for NGAC using system (3.3) (W=[1 0], high-density X_d^0)
Figure 3.43: States of system (3.3) using NGAC (W=[1 0], high-density X_d^0)
Figure 3.44: Control signal of NGAC on system (3.3) (W=[1 0], high-density X_d^0)
Figure 3.45: Level sets of V for NGAC on system (3.3) (W=[1 1], high-density X_d^0)
Figure 3.46: "Gamma" (γ) for NGAC on system (3.3) (W=[1 1], high-density X_d^0)
Figure 3.47: Critical points for NGAC using system (3.3) (W=[1 1], high-density X_d^0)
Figure 3.48: States of artificial system (3.3) using NGAC (W=[1 1], high-density X_d^0)
Figure 3.49: Control signal of NGAC on system (3.3) (W=[1 1], high-density X_d^0)
Figure 3.50: Level sets of V for NGAC on system (3.3) (W=[0 1], high-density X_d^0)
Figure 3.51: "Gamma" (γ) for NGAC on system (3.3) (W=[0 1], high-density X_d^0)
Figure 3.52: Critical points for NGAC using system (3.3) (W=[0 1], high-density X_d^0)
Figure 3.53: States of system (3.3) using NGAC (W=[0 1], high-density X_d^0)
Figure 3.54: Control signal of NGAC on system (3.3) (W=[0 1], high-density X_d^0)
Figure 3.55: Double inverted pendulum system (Misawa et al., 1995)
Figure 3.56: States of system (3.5) using LLARC
Figure 3.57: Control signal of system (3.5) using LLARC
Figure 3.58: States of system (3.5) using NLARC
Figure 3.59: Control signal of system (3.5) using NLARC
Figure 3.60: States of system (3.5) using LGAC (W=[1 0], small X_d^0 range)
Figure 3.61: Control signal of system (3.5) using LGAC (W=[1 0], small X_d^0 range)
Figure 3.62: States of system (3.5) using LGAC (W=[1 1], small X_d^0 range)
Figure 3.63: Control signal of system (3.5) using LGAC (W=[1 1], small X_d^0 range)
Figure 3.64: States of system (3.5) using LGAC (W=[0 1], small X_d^0 range)
Figure 3.65: Control signal of system (3.5) using LGAC (W=[0 1], small X_d^0 range)
Figure 3.66: States of system (3.5) using LGAC (W=[1 0], large X_d^0 range)
Figure 3.67: Control signal of system (3.5) using LGAC (W=[1 0], large X_d^0 range)
Figure 3.68: States of system (3.5) using LGAC (W=[1 1], large X_d^0 range)
Figure 3.69: Control signal of system (3.5) using LGAC (W=[1 1], large X_d^0 range)
Figure 3.70: States of system (3.5) using LGAC (W=[0 1], large X_d^0 range)
Figure 3.71: Control signal of system (3.5) using LGAC (W=[0 1], large X_d^0 range)
Figure 3.72: States of system (3.5) using NGAC (W=[1 0], small X_d^0 range)
Figure 3.73: Control signal of system (3.5) using NGAC (W=[1 0], small X_d^0 range)
Figure 3.74: States of system (3.5) using NGAC (W=[1 1], small X_d^0 range)
Figure 3.75: Control signal of system (3.5) using NGAC (W=[1 1], small X_d^0 range)
Figure 3.76: States of system (3.5) using NGAC (W=[0 1], small X_d^0 range)
Figure 3.77: Control signal of system (3.5) using NGAC (W=[0 1], small X_d^0 range)
Figure 3.78: States of system (3.5) using NGAC (W=[1 0], large X_d^0 range)
Figure 3.79: Control signal of system (3.5) using NGAC (W=[1 0], large X_d^0 range)
Figure 3.80: States of system (3.5) using NGAC (W=[1 1], large X_d^0 range)
Figure 3.81: Control signal of system (3.5) using NGAC (W=[1 1], large X_d^0 range)
Figure 3.82: States of system (3.5) using NGAC (W=[0 1], large X_d^0 range)
Figure 3.83: Control signal of system (3.5) using NGAC (W=[0 1], large X_d^0 range)
Figure 3.84: States of system (3.7) using LLARC
Figure 3.85: Control signal of system (3.7) using LLARC
Figure 3.86: States of system (3.7) using NLARC
Figure 3.87: Control signal of system (3.7) using NLARC
Figure 3.88: States of system (3.7) using LGAC (W=[1 0], small X_d^0 range)
Figure 3.89: Control signal of system (3.7) using LGAC (W=[1 0], small X_d^0 range)
Figure 3.90: States of system (3.7) using LGAC (W=[1 1], small X_d^0 range)
Figure 3.91: Control signal of system (3.7) using LGAC (W=[1 1], small X_d^0 range)
Figure 3.92: States of system (3.7) using LGAC (W=[0 1], small X_d^0 range)
Figure 3.93: Control signal of system (3.7) using LGAC (W=[0 1], small X_d^0 range)
Figure 3.94: States of system (3.7) using LGAC (W=[1 0], large X_d^0 range)
Figure 3.95: Control signal of system (3.7) using LGAC (W=[1 0], large X_d^0 range)
Figure 3.96: States of system (3.7) using LGAC (W=[1 1], large X_d^0 range)
Figure 3.97: Control signal of system (3.7) using LGAC (W=[1 1], large X_d^0 range)
Figure 3.98: States of system (3.7) using LGAC (W=[0 1], large X_d^0 range)
Figure 3.99: Control signal of system (3.7) using LGAC (W=[0 1], large X_d^0 range)
Figure 3.100: States of system (3.7) using NGAC (W=[1 0], small X_d^0 range)
Figure 3.101: Control signal of system (3.7) using NGAC (W=[1 0], small X_d^0 range)
Figure 3.102: States of system (3.7) using NGAC (W=[1 1], small X_d^0 range)
Figure 3.103: Control signal of system (3.7) using NGAC (W=[1 1], small X_d^0 range)
Figure 3.104: States of system (3.7) using NGAC (W=[0 1], small X_d^0 range)
Figure 3.105: Control signal of system (3.7) using NGAC (W=[0 1], small X_d^0 range)
Figure 3.106: States of system (3.7) using NGAC (W=[1 0], large X_d^0 range)
Figure 3.107: Control signal of system (3.7) using NGAC (W=[1 0], large X_d^0 range)
Figure 3.108: States of system (3.7) using NGAC (W=[1 1], large X_d^0 range)
Figure 3.109: Control signal of system (3.7) using NGAC (W=[1 1], large X_d^0 range)
Figure 3.110: States of system (3.7) using NGAC (W=[0 1], large X_d^0 range)
Figure 3.111: Control signal of system (3.7) using NGAC (W=[0 1], large X_d^0 range)

List of Symbols

x: system state
f, g: nonlinear system dynamics
A, B: linearized system dynamics
u: system control
ℜ^n: n-dimensional vector space over the real line
C^1: space of all real-valued differentiable functions
V: Lyapunov function
P: Lyapunov function parameters
K: controller parameters
Q: quadratic performance index weighting matrix
inf: "infimum" of set theory
sup: "supremum" of set theory
γ_min: minimum rate of convergence
X^0: region of interest
X: region of attraction
X_d^0: checking set
u_max: maximum control effort
‖·‖: 2-norm
ε: distance function in the region of interest
L_f: system Lipschitz constant
E: optimization parameters
λ̲: minimum eigenvalue
λ̄: maximum eigenvalue
p_r: probability of copy
p_c: probability of crossover
p_m: probability of mutation
s: fitness bias parameter
μ_γ: mean rate of convergence
σ_γ: standard deviation of the rate of convergence
μ_u: mean control effort
σ_u: standard deviation of control effort

1 Introduction

1.1 Motivation

1.1.1 CLF Design Theory

Linear H2 and H∞ design methods are proven tools for designing control laws for guaranteed optimal and
robust performance of linear systems. Zhou (1997) and Levine et al. (1996) provide introductions to these subjects. The linear structure allows controller design methods applicable to all linear systems that are controllable and observable. Although nonlinear systems do not lend themselves to a single design method, many engineering systems are linearizable and can be fit into the linear H2 and H∞ design framework. The drawback is that the resulting controllers exhibit degraded performance because they must be made robust against the neglected nonlinearities.

A more general and less conservative controller design method for nonlinear systems is the use of control Lyapunov functions (CLFs) (Khalil, 1996), (Krstic, 1995). This approach forces the "energy" of the system to decrease with time. "Energy" is represented by a positive definite function of the system states, called the CLF. As the states increase in magnitude, the CLF increases. For example, in a mechanical system, energy increases with increasing positions and velocities, so an appropriate CLF also increases with the positions and velocities. The CLF time derivative can then be made a function of the control input(s), such that any control law that causes the CLF to decrease continually over time (in a closed set of state space) is guaranteed to drive the states to the equilibrium point(s): if the energy of the system continuously decreases, the system must settle to an equilibrium point.

Lyapunov stability theory can be used to find a satisfactory control law by two broad approaches (Kokotovic & Arcak, 2001):

1. A control law is selected, and a search for a Lyapunov function is conducted to prove the performance of the closed-loop system; or

2. A control Lyapunov function is selected, from which a control law is derived that yields certain performance measures.

The difference between the two approaches is that the first relies primarily upon analysis, while the second relies on synthesis.
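The energy-decrease argument above can be sketched numerically. The second-order system (an undamped oscillator), the damping control u = -x2, and the candidate CLF in this sketch are assumptions chosen for illustration; they are not the benchmark systems studied later in the thesis.

```python
# Sketch of the CLF "energy decrease" idea on an assumed example:
# an undamped oscillator x1' = x2, x2' = -x1 + u, stabilized by the
# damping control u = -x2, with candidate CLF V(x) = 0.5*(x1^2 + x2^2).
# Along closed-loop trajectories, Vdot = -x2^2 <= 0, so V acts as a
# continuously decreasing "energy".
def simulate_clf(x1=1.0, x2=0.0, dt=0.01, steps=1000):
    V_hist = []
    for _ in range(steps):
        V_hist.append(0.5 * (x1 * x1 + x2 * x2))
        u = -x2                        # control law chosen so V decreases
        dx1, dx2 = x2, -x1 + u         # closed-loop dynamics
        x1, x2 = x1 + dt * dx1, x2 + dt * dx2   # forward-Euler step
    V_hist.append(0.5 * (x1 * x1 + x2 * x2))
    return V_hist
```

Recording V along the simulated trajectory shows it shrinking toward zero, which is exactly the guarantee a CLF-based control law provides (up to small forward-Euler discretization error).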
Specifically, in the first approach the control law is fixed, so the performance measures are invariant to the selection of the Lyapunov function. Hence the Lyapunov function acts merely as a tool to guarantee stability and, sometimes, to estimate the performance measures. In the second approach, the control law inherits guaranteed performance measures from the CLF (i.e., rate of convergence and region of attraction). The problem is that the CLF may be a poor selection for attaining the desired performance measure, because its topology and parameter values may be incapable of yielding "globally optimal" performance, where "global" means over the set of all possible CLFs.

The work of Johansen (2000, a & b) offers a computational procedure for generating non-quadratic Lyapunov functions which can be used to estimate the performance of smooth nonlinear systems. The work shows that any Lyapunov function may be represented to arbitrary accuracy by a sufficiently large finite summation of quadratic functions weighted by smooth switching functions. Johansen's work indirectly supports the need to appropriately tune a CLF so that an accurate estimate of the performance of the system may be obtained and used by the optimization algorithm.

1.1.2 CLF Design Examples

The manner in which the system states and control signal settle to equilibrium is the focus herein and will be referred to as the performance measure of the control law/CLF pair. Typical performance measures include the RMS and maximum values of the system states and control signals, the rate of convergence of the system, and the maximum region of attraction, loosely defined as the set of points in state space from which the state returns to equilibrium (defined precisely below). Because the CLF time derivative depends on the control signals, the performance measures depend on both the selection of the CLF and the control law.
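The RMS and maximum performance measures just mentioned can be computed directly from a simulated trajectory. The system and control law here are assumptions (the same illustrative damped oscillator as above), not one of the thesis benchmarks.

```python
import math

# Illustrative computation of RMS / maximum performance measures from a
# simulated trajectory of an assumed closed-loop system:
# x1' = x2, x2' = -x1 + u, with u = -x2.
def rms(seq):
    return math.sqrt(sum(v * v for v in seq) / len(seq))

def performance_measures(dt=0.01, steps=1000):
    x1, x2 = 1.0, 0.0
    xs, us = [], []
    for _ in range(steps):
        u = -x2
        xs.append(math.hypot(x1, x2))   # ||x|| at each sample
        us.append(abs(u))               # control magnitude at each sample
        x1, x2 = x1 + dt * x2, x2 + dt * (-x1 + u)
    return {"x_rms": rms(xs), "x_max": max(xs),
            "u_rms": rms(us), "u_max": max(us)}
```

Measures like these, evaluated over many initial conditions, are the kind of scalar scores an optimizer can compare when tuning the (CLF, control law) pair.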
In the work that follows, we first select the CLF and control law functions, then use a numerical method to tune the parameters of these functions to satisfy the performance requirements. We investigate the simultaneous numerical tuning of both the CLF and the control law parameters; by allowing both entities to vary during the search for a solution, shortcomings of other methods may be overcome. Henceforth, the resulting CLF and control law will correspond to the pair (P,K), where P stands for the vector of parameters of the CLF and K for the vector of parameters of the control law. In some situations, P and K may not be independent of each other due to the selection of CLF and controller topologies (e.g., K = -(1/2)B^T P, B ∈ ℜ^(n×1), P ∈ ℜ^(n×n)).

To demonstrate the utility of this approach, consider as an example a parameterized 2nd-order linear time-invariant system given by

  ẋ = [0 1; -a -a]x + [0; 1]u,  a > 0,  x = [x1 x2]^T ∈ ℜ^2,  u ∈ ℜ  (1.1)

where dot notation indicates time differentiation and matrices are written row-wise, with rows separated by semicolons. We wish to use CLF theory to design a stabilizing control, u, of the states, x. We select a parameterized form of the CLF, V(x), given by

  V(x,b) = x^T [b 0; 0 1-b] x,  0 < b < 1  (1.2)

We use this form of CLF so that we may vary the influence of x1 and x2 on the CLF. Suppose we seek to find a control such that

  V̇(x) = -x2^2  (1.3)

Computing V̇ yields

  V̇(x,a,b) = 2(b - a(1-b))x1x2 + 2a(b-1)x2^2 + 2(1-b)x2u  (1.4)

It is straightforward to show that (1.3) can be satisfied by selecting u as

  u(x,a,b) = (a - b/(1-b))x1 + (a - 1/(2(1-b)))x2  (1.5)

The pair (P,K) becomes (b, [a b]). Now suppose we desire to minimize the control effort inside the region defined by S = {x ∈ ℜ^2 : x1 ∈ [-1,1], x2 ∈ [-1,1]} (a unit square centered at the origin in state space) by varying the free parameter b.
With a held fixed, a suitable performance measure could be defined as

  J(b) = ∫_{-1}^{1} ∫_{-1}^{1} u(x,a,b)^2 dx1 dx2  (1.6)

in which we wish to minimize the control effort in S. It may be shown that, for the admissible ranges of a and b, the minimum of (1.6) exists at the point

  b* = {b ∈ ℜ : dJ/db = 0}  (1.7)

The derivative of J with respect to b is

  dJ/db = (2/3)(5 - 2(3a + 2)(1 - b))/(1 - b)^3  (1.8)

In this special case, solving for b* in (1.7) yields

  b* = 1 - 5/(2(3a + 2)) = (6a - 1)/(6a + 4)  (1.9)

We see that the selection of the CLF in (1.2) has an effect on the optimality of the resulting control. Our example minimized the control effort in a region of state space: had we selected an arbitrary value for b other than the b* of (1.9), a suboptimal controller would have resulted. In addition, a ≤ 1/6 implies b* ≤ 0, violating the requirement 0 < b < 1, so for a ≤ 1/6 no admissible b attains the minimum of (1.6).

Now consider a similar investigation of a 2nd-order nonlinear time-invariant system. The system equations are

  ẋ = f(x) + g(x)u,  f(x) = [x2^3; 0],  g(x) = [0; 1],  u ∈ ℜ^1  (1.10)

In this example, we begin by selecting Sontag's universal formula (Sontag, 1989) as the control law

  u = -( (∂V/∂x)f + sqrt( ((∂V/∂x)f)^2 + ((∂V/∂x)g)^4 ) ) / ((∂V/∂x)g),  if (∂V/∂x)g ≠ 0
  u = 0,  if (∂V/∂x)g = 0  (1.11)

where V : ℜ^2 → ℜ. Assuming a CLF, V(x), exists, the control law of (1.11) is a smooth, globally asymptotically stabilizing control for (1.10) because it forces the state trajectories of (1.10) to descend the gradient of the CLF. A problem with such a control law is that the "(∂V/∂x)g" term in the denominator may cause u to become very large. To investigate the role of the CLF, we assume that the quadratic function

  V(x,b,θ) = x^T R(θ)^T [b 0; 0 1-b] R(θ) x,  R(θ) = [cos θ  sin θ; -sin θ  cos θ],  0 < b < 1,  0 ≤ θ < π/2  (1.12)
Equation (1.12) is general quadratic function similar to (1.2), except that we introduce a new parameter,θ , which rotates the eigenvectors about the origin. The local CLF gradient is ⎥⎦ ⎤ ⎢⎣ ⎡ − − − = ∂ ∂ θ θ θ θ (1 )sin (1 ) cos cos sin b b b b x x V T (1.13) The specific expression for the control law (1.11) now becomes ( ) ( ) ⎪ ⎪ ⎪ ⎩ ⎪ ⎪ ⎪ ⎨ ⎧ − − = − − ≠ − − + − + − + − + + − = 0 sin (1 )cos 0 sin (1 )cos 0 sin (1 )cos sin (1 )cos cos (1 )sin cos (1 )sin 1 2 1 2 1 2 4 1 2 2 2 3 1 4 1 2 3 1 4 1 θ θ θ θ θ θ θ θ θ θ θ θ f x b x b if x b x b x b x b x b x b x b x x b x b x x b u (1.14) Ensuring that (1.14) does not grow too large might involve minimizing the maximum control effort on some set S by tuning b and θ . Since a closed form optimal solution probably does not exist or is at least unknown, u would have to be computed and evaluated at all the points in S for every iteration of the optimization algorithm. Even for such a relatively simple system and control law, we clearly have a multimodal optimization problem which requires many function evaluations. It is much more desirable to find an optimization algorithm that scales nicely with problem complexity. 7 1.2 Definitions, Problem Statement, and Background Theory 1.2.1 Scope of Systems and CLF’s In the following, we assume differentiable timeinvariant nonlinear systems and controls of the form x& = f (x)+ g(x)u(x,K) (1.15) where x∈ℜn is the ndimensional state vector, K ∈ℜp is the pdimensional control gain vector, u :ℜn+ p →ℜ is the control input, f , g :ℜn →ℜn are the system dynamics, and u, f , g ∈C1 , f(0)=0, where C1 is the space of all differentiable functions. Chapter 2 discusses the specific control laws selected for the study. We restrict the analysis to quadratic Lyapunov functions of the form V (x) = xT Px (1.16) where P∈ℜnxn , P = PT > 0 . The optimization parameters shall be represented by the pair (P,K). 
Adapted from (Sontag, 1989), a local CLF is defined in this work as a smooth, proper, and positive definite function V : ℜ^n → ℜ such that

  inf_{u∈ℜ} V̇(x,u) < 0 for all x ∈ X^0 - {0}  (1.17)

where X^0 ⊂ ℜ^n, 0 ∈ X^0, is the region of interest. The region of interest is defined as a subset of state space, over which the controller is optimized, that contains a neighborhood of the origin and only one equilibrium point. We shall use "local CLF" and "CLF" interchangeably from this point on in the discussion.

1.2.2 Performance Measures

The performance measures considered herein are: 1) the minimum rate of convergence, γ_min; 2) the region of attraction, X; and 3) the maximum control effort, u_max. These performance measures are defined below for all x in a region of state space, X^0.

The minimum rate of convergence, γ_min, is defined by the expression (Johansen, 2000 a)

  ‖x(t)‖ ≤ c‖x(0)‖e^(-γ_min t/2),  c > 0,  t ≥ 0  (1.18)

where ‖·‖ denotes the 2-norm, V(x) ≥ c1‖x‖^2, V̇(x) ≤ -γ_min V(x), and c is determined by λ̄(P), the magnitude of the largest eigenvalue of P. This performance measure means that the norm of the states will converge no slower than an exponential decay with time constant 2/γ_min, and is based on uniform exponential stability.

The region of attraction, X, is defined as (Johansen, 2000 a)

  X = { x ∈ X^0 : V(x) ≤ inf_{ξ∈∂X^0} V(ξ) }  (1.19)

where inf denotes the "infimum" of set theory and ∂X^0 represents the border of the set X^0. This definition is based on the fact that the trajectory of the system cannot cross a level set Ω_α of the Lyapunov function, Ω_α = { x ∈ X^0 : V(x) = α }, α > 0, because the Lyapunov function is a decreasing function of time. The value of α in this case is α = inf_{ξ∈∂X^0} V(ξ).

The maximum control effort, u_max, will be defined as

  u_max = sup_{x∈X} ‖u‖  (1.20)
The utility of this performance measure is that, along with the minimum region of attraction performance measure (1.19), it may be used to design controllers for systems with sensor and actuator saturations by ensuring the saturation levels are avoided. 1.2.3 Problem Statement The objective of this thesis is to demonstrate the effectiveness of tuning both the control law and the local CLF simultaneously to maximize the rate of convergence and minimize the control effort of nonlinear systems. The controller design intent is to find a controller tune specific to the nonlinear system that is less conservative than the tune based on robust linear systems theory. Two control laws, defined in Chapter 2, will be tested: 1. an LQR fullstate feedback control law, and 2. a Sontaglike nonlinear fullstate feedback control law. It is assumed that a quadratic Lyapunov function is a local CLF for the nonlinear systems studied herein. The proposed solution method does not offer a strict guarantee on performance level. However, it is suggested that with enough random performance sampling, the system will achieve the estimated performance level with sufficiently high confidence, making the proposed method a practical solution for realworld controller design. 1.2.4 Literature Review Convex optimization techniques were used in Johansen (2000, a & b) to numerically compute a generalized nonquadratic Lyapunov function for Lipschitz nonlinear systems. The work was based solely on the use of Lyapunov functions as tools to measure the performance of smooth (locally Lipschitz) nonlinear systems. Extensions to using the method for controller design were not addressed, however. Ghaoui & 10 Balakrishnan (1994) proposed the socalled “VK” iteration technique for linear control systems (analogous to the “DK” iteration of musynthesis (Zhou, 1997)), where the control gains, K, are held fixed and the quadratic Lyapunov matrix, P, is found using LMI techniques. 
This step minimizes the time derivative of the Lyapunov function; P is then held fixed while K is varied to minimize the time derivative of the Lyapunov function. Although Johansen's work was not used for controller design, an extension is fairly straightforward. While not covered herein, an interesting avenue of future research would be to replace the quadratic Lyapunov functions and linear control laws of Ghaoui and Balakrishnan's VK iteration method with the general non-quadratic Lyapunov functions of Johansen's method, and then employ control laws that are linear in the parameters to extend the VK iteration method to Lipschitz nonlinear systems that are affine in the control. Johansen's method exploits the structure of the selected Lyapunov function, which is linear in the parameters. If a control-affine system and a linear-in-the-parameters control law were assumed, the controller parameters would also appear linearly in V̇, so that a convex optimization method could be used to alternately tune both the CLF and the controller parameters.

1.2.5 Restrictions of Current Methods

Ghaoui and Balakrishnan's method is restricted to quadratic Lyapunov functions and linear systems. The possible extension to Johansen's method outlined above is restricted to control-affine nonlinear systems with linear-in-the-parameters control laws. In addition, these methods require that the CLF and the control law be tuned separately. Fixing one set of parameters while tuning another is a restriction in the optimization method, which may hinder the parameter search. In terms of the optimization mechanism known as "hill climbing", holding some parameters constant while varying others is analogous to traveling up the hill in alternating orthogonal directions, whereas tuning all parameters simultaneously is analogous to following the gradient all the way up the hill. Following the gradient can be a more efficient search mode because it is instantaneously the fastest way to increase elevation.
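The hill-climbing analogy can be made concrete on a toy objective (purely illustrative; this is not the thesis' optimizer): minimizing an assumed ill-conditioned quadratic first by alternating exact one-variable steps (the orthogonal-direction mode) and then by following the negative gradient.

```python
# Toy objective with a strong cross-term, so the two coordinates interact:
def J(x, y):
    return x * x + y * y + 1.8 * x * y

def coordinate_descent(x=1.0, y=-1.0, sweeps=500):
    # Alternate exact one-variable minimizations ("orthogonal directions").
    for _ in range(sweeps):
        x = -0.9 * y        # argmin of J over x with y frozen
        y = -0.9 * x        # argmin of J over y with x frozen
    return J(x, y)

def gradient_descent(x=1.0, y=-1.0, step=0.4, iters=500):
    # Follow the (negative) gradient, moving both variables at once.
    for _ in range(iters):
        gx, gy = 2 * x + 1.8 * y, 2 * y + 1.8 * x
        x, y = x - step * gx, y - step * gy
    return J(x, y)
```

Both modes reach the minimum at the origin on this convex toy problem; the point of the analogy is only that simultaneous moves follow the locally steepest direction, while alternating moves are constrained to the coordinate axes.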
The approach taken herein exploits the interaction between the CLF and the control law as both are varied simultaneously, with the intention of finding better controllers and/or a better optimization method.

1.2.6 Checking Performance Measures for Nonlinear Systems

The performance measures must be met everywhere in X^0. However, checking every point is impossible because X^0, although a compact set, is an uncountably infinite set of points. We must therefore find a way to sample X^0 and check the performance measures on a discrete finite subset, X_d^0, while guaranteeing that the points between the "checking points" of X_d^0 also satisfy the requirements. The following is taken from "Theorem 2" of Johansen (2000 a) and may be used to guarantee that the Lyapunov conditions are met for all x ∈ X^0, given that they are satisfied for all x ∈ X_d^0. We first present some preliminary definitions used in the theorem. Define the checking set density function ε(x) by

  ε(x) = inf_{x_d ∈ X_d^0} ‖x - x_d‖  (1.21)

which is the distance from a point in the region of interest, x ∈ X^0, to its closest neighbor in the checking set, x_d ∈ X_d^0. Define the Lipschitz constant of f as L_f. The size parameter will be defined as S̄ = sup_{x∈X^0, k∈Κ} ‖f(x,k)‖, where Κ is the set of admissible values for the control gains; in addition, define P̄ = λ̄(P) and X̄^0 = sup_{x,y∈X^0} ‖x - y‖.

Theorem 1.1 (adapted from Theorem 2 of Johansen (2000 a)): Suppose X^0 is a compact set, V(x) > 0 for all x ∈ X^0 - {0}, and f is a bounded, locally Lipschitz function. Let γ > 0 be given and suppose there exists an α > γ > 0 such that for all x ∈ X_d^0

  V̇(x) ≤ -αV(x)  (1.22)

Assume the checking set grid density is such that

  ε(x) ≤ (α - γ)V(x)/Q̄  (1.23)

where

  Q̄ = 2S̄P̄ + 2X̄^0 P̄ L_f + 2αP̄X̄^0  (1.24)

Then for all x ∈ X^0

  V̇(x) ≤ -γ_min V(x)  (1.25)

Proof: see Johansen (2000 a).

Johansen's theorem allows us to ensure stability for the points not sampled.
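The grid-checking idea behind Theorem 1.1 can be sketched as follows. The closed-loop system, P, and α below are assumptions chosen for illustration; for this linear example V̇(x) = -‖x‖^2 exactly, so the decay condition (1.22) can be verified at every checking point of a uniform grid X_d^0 on [-1,1]^2.

```python
# Check Vdot(x) <= -alpha*V(x) on a finite checking grid (assumed example):
# closed loop x' = Acl x, Acl = [[0,1],[-1,-1]], V = x^T P x with
# P = [[1.5, 0.5], [0.5, 1.0]], for which Vdot(x) = -(x1^2 + x2^2).
def vdot_and_v(x1, x2):
    p11, p12, p22 = 1.5, 0.5, 1.0
    dx1, dx2 = x2, -x1 - x2
    v = p11 * x1 * x1 + 2 * p12 * x1 * x2 + p22 * x2 * x2
    vdot = 2 * (p11 * x1 + p12 * x2) * dx1 + 2 * (p12 * x1 + p22 * x2) * dx2
    return vdot, v

def check_grid(alpha=0.5, n=21):
    h = 2.0 / (n - 1)              # grid spacing on [-1, 1]^2
    eps = h * (2 ** 0.5) / 2       # worst-case distance to a grid point
    violations = 0
    for i in range(n):
        for j in range(n):
            x1, x2 = -1.0 + i * h, -1.0 + j * h
            if x1 == 0.0 and x2 == 0.0:
                continue           # the condition is checked on X0 - {0}
            vdot, v = vdot_and_v(x1, x2)
            if vdot > -alpha * v:
                violations += 1
    return violations, eps
```

Here α = 0.5 passes at every grid point; in the nonlinear setting, Theorem 1.1 is what converts such a finite check (with ε small enough relative to Q̄) into a guarantee on all of X^0.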
The theorem suggests that if the state space is sampled finely enough, then an accurate estimate of γ_min may be achieved. The required "closeness" of the sampling points is related through Q, a measure of how fast the system states change (proportional to the system Lipschitz constant). By (1.22), the size of γ_min in (1.25) is directly related to α, ε, V, and Q. We note, however, that computing Q is very difficult in practice because for highly nonlinear systems, finding L_f can be very difficult. In addition, both P̄ and L_f are variables in the search for (P,K). Therefore any P̄ and L_f values used initially that satisfy (1.23) might change enough during the search to invalidate (1.23), requiring refinement of the checking grid and possibly invalidating the progress made by the search algorithm. We therefore propose a more practical method: fix ε(x) = ε, ε > 0, and estimate γ_min via random sampling of X_0 (described in Chapter 2). The estimation of γ_min of course occurs after a (P,K) pair is found that satisfies (1.22). Finding a positive value of γ_min does not imply that the performance measures of (1.17)-(1.19) have been met. However, it does imply the satisfaction of performance measures of the form defined by (1.18)-(1.20), in the following ways: 1) a positive value of γ_min is precisely a minimum rate of convergence (1.18); 2) γ_min is defined on some region of X_0 containing the origin, which must contain a connected level set of the Lyapunov function whose interior's minimum rate of convergence is determined by γ_min in (1.18), assuring the region X_0 contains a minimum region of attraction (1.19); and 3) u has an upper bound on X_0 since u is differentiable and X_0 is bounded. In other words, finding a (P,K) pair which yields γ_min > 0 means that on X_0 a decay rate, a region of attraction, and a maximum control effort exist, but not necessarily satisfying the desired amounts.
A Method for Relaxing the Checking Grid Density Requirements Around the Origin

Theorem 1.2: Suppose we have a system of the form (1.15) and a CLF of the form (1.16). Let X_0 be a compact set containing the origin and, without loss of generality, assume the system is locally stable and u = 0. Define A = ∂f/∂x |_{x=0} and

γ_min = inf_{x ∈ X_0 − {0}} ( −V̇/V )   (1.26)

γ_min^L = inf_{x ∈ X_0 − {0}} ( −x^T(A^T P + PA)x / (x^T P x) )   (1.27)

Then as X̄_0 = sup_{x,y ∈ X_0} ‖x − y‖ → 0,

γ_min → γ_min^L   (1.28)

Proof: The Taylor series expansion of f about the origin is

f(x) = Ax + D(x)   (1.29)

where D(x) represents the higher-order terms of the expansion. The time derivative of V becomes

V̇ = (∂V/∂x) f(x) = (∂V/∂x)(Ax + D(x)) = x^T(A^T P + PA)x + 2x^T P D(x)   (1.30)

Decomposing γ_min into two parts gives

γ_min = inf_{x ∈ X_0 − {0}} ( −x^T(A^T P + PA)x/(x^T P x) − 2x^T P D(x)/(x^T P x) )   (1.31)

Using the facts that inf_x (a(x) + b(x)) ≤ inf_x a(x) + sup_x b(x) and inf_x a(x) − sup_x |b(x)| ≤ inf_x (a(x) + b(x)), together with the definition of γ_min^L in (1.27), one may arrive at the inequality

γ_min^L − sup_{x ∈ X_0 − {0}} |2x^T P D(x)/(x^T P x)| ≤ γ_min ≤ γ_min^L + sup_{x ∈ X_0 − {0}} |2x^T P D(x)/(x^T P x)|   (1.32)

The inequality (1.32) implies

|γ_min − γ_min^L| ≤ sup_{x ∈ X_0 − {0}} |2x^T P D(x)/(x^T P x)|   (1.33)

Since the lowest-order terms in the polynomial D(x) are quadratic, the lowest-order terms in the polynomial x^T P D(x) are cubic, ensuring that as X̄_0 = sup_{x,y ∈ X_0} ‖x − y‖ → 0, x^T P D(x)/(x^T P x) → 0, implying γ_min → γ_min^L.
■

Because of the large number of sample points usually required to obtain a reliable estimate of γ_min, it is more efficient to compute the eigenvalues of P and A^T P + PA and instead use the bounding relationship

−λ̄(A^T P + PA)/λ̄(P) ≤ γ_min^L   (1.34)

By Theorem 1.2, for a sufficiently small region about the origin, the difference between γ_min and γ_min^L is small, making the left side of (1.34) a good estimate of γ_min for points near the origin and dramatically reducing the required size of the checking set X_d^0.

1.3 Thesis Outline

Chapter 1 motivates the search for an optimization method for tuning CLFs and their corresponding control laws. Chapter 2 poses the specific optimization problem and describes the specific genetic algorithm used to solve it. Chapter 3 provides examples of how to use the genetic algorithm of Chapter 2 to tune full-state-feedback CLF controllers. Chapter 4 outlines future research that explores the addition of uncertainty and adaptation in the control systems, and also addresses the use of more generalized CLFs, as well as discrete-time control.

2 A Genetic Algorithm for CLF Optimization

2.1 Genetic Algorithm Motivation

Genetic algorithms (GAs) are ideal candidates for a general optimization method for solving the dual tuning problem described in Chapter 1. The reader is referred to Fleming & Purshouse (2002) for an excellent survey on evolutionary algorithms in control engineering. GAs are parallel, global, stochastic search algorithms that do not depend on derivatives to perform the optimization – their most attractive feature. The stochastic nature of the algorithm allows it to make random jumps into hard-to-reach regions of the search space that may contain a local extremum. These regions would otherwise be inaccessible using only local gradient information. Metaphorically speaking, GAs are capable of jumping over valleys onto other mountains in search of the highest peak.
The parallel search feature emerges from the population of search points being spread out across the parameter search space. This allows the simultaneous exploration of several regions of search space containing local maxima. Non-differentiable fitness functions may be used as the optimization objective function because the search movements are not based on gradients, but occur from either random jumps called mutations or selective combinations of two or more highly fit individuals called crossovers. The crossovers effectively serve as interpolations and extrapolations of the existing search points in the GA population. The GA is capable of making fast progress in the parameter search effort for difficult problems. However, unlike gradient-based methods, GAs lack a guarantee of making steady progress towards a local maximum. A remedy to this problem is decreasing the mutation step size. This mimics the small incremental progress made by a gradient method, because a small random jump is, in some situations, likely to have a positive projection upon the gradient direction, so that a step in the right direction is made. Gradient and other methods that use only local objective function information are not attractive here for three major reasons: (1) the multimodal nature of the objective function leads to convergence to local extrema; (2) the objective function evaluation points drift in time (explained below), giving the objective function a time-varying nature which calls for a variable step rate to balance numerical stability with making fast enough progress, adding an unnecessary degree of difficulty to the problem; and (3) the non-differentiability of an objective function is incompatible with a gradient method, calling for the use of a finite difference approximation to the gradient, which can be inefficient for a large parameter space (many function evaluations must be made just to move a small amount in parameter space).
A general rule of thumb for ensuring GAs make sufficient progress is that the population have sufficient size and be given sufficient time. This typically makes them inefficient for searching low-dimensional parameter spaces. However, outweighing this drawback is their ability to make fast progress in large parameter search spaces, as well as their ability to optimize both the parameters and topology of functions, as with genetic programming, a subset of genetic algorithms (Koza et al., 1999).

2.2 Tailoring to the Specific Problem

2.2.1 Optimization Objectives

The objective of the optimization algorithm is to find a (P,K) pair that satisfies the exponential stability conditions

V(x) ≥ v‖x‖², v > 0   (2.1)

and

V̇(x) ≤ −γ_min V(x), γ_min > 0   (2.2)

for all sampled points x ∈ X_d^0. In addition to (2.1) and (2.2), we also consider the maximum allowable control effort on the sample set, labeled u_max,

u_max ≥ sup_{x ∈ X_d^0} |u|   (2.3)

We define an admissible (P,K) pair as the set of parameter values that make the system (1.15) and Lyapunov function (1.16) satisfy (2.1), (2.2), and (2.3) for some user-specified triple (v, γ_min, u_max) on the checking set X_d^0. Typically, desired values of (v, γ_min, u_max) are not known, so they are allowed to "float" with progressive improvement during the optimization, and the user decides if the values obtained are good enough at the end of the optimization run.

2.2.2 Simplifying the Search

We restrict our investigation to quadratic Lyapunov functions. Due to computational restrictions and the use of Theorem 1.2 to reduce the required size of X_d^0, we also restrict the design optimization to closed convex sets about the origin. To add an additional degree of local robustness and optimality to the closed-loop nonlinear system, we also assume the CLF is locally H₂ inverse optimal (Kokotovic & Arcak, 2001).
That is, the CLF matrix P is the solution to the LQR problem for the linearized system for some set of state and control effort weighting matrices in the objective function. The problem reduces to using the GA to find E ∈ ℜ^{n×n} such that

Q = E^T E + cI, c ≥ 0   (2.4)

so that λ_min(Q) ≥ c. We compute a P = P^T > 0 that satisfies the Hamilton-Jacobi-Bellman (HJB) equation (also known as the Algebraic Riccati Equation) for linear time-invariant systems (Levine et al., 1996)

A^T P + PA − PBB^T P + Q = 0   (2.5)

The solution to (2.5) minimizes the performance index J = ∫_t^∞ (x^T Q x + u²) dt when we assign the control as

u = −B^T P x   (2.6)

We thus use E to generate the CLF V = x^T P x. An additional utility of the approach is the direct computation of a full-state-feedback linear control (2.6), if such a control law is desired. One may argue that selecting (2.6) for the control is like using the gradient direction of V normalized to satisfy the HJB equation (Primbs et al., 2000). Therefore, similar to the argument presented in Theorem 1.2, any control law that approaches u = −B^T P x as x approaches the origin will also be a locally optimal controller for the nonlinear system. In addition, because we sample the performance index for the nonlinear system on the set X_d^0, the controller shall be optimal on X_d^0. In summary, we attempt to maximize the rate of convergence and minimize the control effort on X_d^0 using controllers whose linearization is inverse optimal for the linearized system dynamics. A more general version of the control (2.6), based on the proposed control of Primbs et al. (2000), is

u = −( (∂V/∂x)f + √( ((∂V/∂x)f)² + q(x)((∂V/∂x)g)² ) ) / ((∂V/∂x)g) if (∂V/∂x)g ≠ 0, and u = 0 if (∂V/∂x)g = 0   (2.7)

The control is a modified version of Sontag's Universal Formula (Sontag, 1989) for nonlinear systems that are single-input and affine in the control.
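The pipeline E → Q → P → u of (2.4)-(2.7) can be sketched numerically. Below is a minimal numpy sketch, assuming a unit control weight (R = I) as in the scalar-u performance index above; `solve_are` uses the standard stable-invariant-subspace construction for the Riccati equation (2.5), and the function names are illustrative, not from the thesis:

```python
import numpy as np

def solve_are(A, B, Q):
    """Solve A'P + PA - P B B' P + Q = 0 (equation (2.5), R = I) for the
    stabilizing P = P' > 0 via the stable invariant subspace of the
    associated Hamiltonian matrix."""
    n = A.shape[0]
    H = np.block([[A, -B @ B.T], [-Q, -A.T]])
    w, V = np.linalg.eig(H)
    stable = V[:, w.real < 0]            # eigenvectors of the n stable modes
    U1, U2 = stable[:n, :], stable[n:, :]
    P = np.real(U2 @ np.linalg.inv(U1))
    return (P + P.T) / 2                 # symmetrize against round-off

def sontag_control(x, P, f, g, q, tol=1e-12):
    """Modified Sontag formula (2.7) for a single-input control-affine
    system with quadratic CLF V = x'Px, so dV/dx = 2x'P; f and g map
    states to vectors, q is the state penalty q(x)."""
    dV = 2.0 * (x @ P)
    a = float(dV @ f(x))                 # (dV/dx) f
    b = float(dV @ g(x))                 # (dV/dx) g
    if abs(b) < tol:                     # the (dV/dx) g = 0 branch of (2.7)
        return 0.0
    return -(a + np.sqrt(a * a + q(x) * b * b)) / b
```

For an LTI system with f(x) = Ax, g(x) = B, and q(x) = x'Qx, `sontag_control` reproduces the LQR law u = −B'Px of (2.6), which is the reduction noted after (2.8).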
The performance index minimized by (2.7) has the form J = ∫_0^∞ (q(x) + u²) dt, with q: ℜ^n → ℜ, q(x) > 0 for all x ≠ 0, q(0) = 0, and arises from the solution of a more general version of the HJB equation (Primbs et al., 2000)

(∂V/∂x)f − (1/4)((∂V/∂x)g)² + q(x) = 0   (2.8)

Notice that (2.7) and (2.8) are equivalent to (2.6) and (2.5), respectively, for the case of LTI systems. Using the control of (2.7) with a quadratic CLF and the linearized dynamics of the nonlinear system reduces to the LQR control of (2.6). Therefore (2.7) combined with a quadratic CLF and the nonlinear system dynamics yields globally asymptotically stable dynamics with local optimality. In this work, we shall consider (2.6) only for linear controllers and (2.7) only for nonlinear controllers.

2.3 Genetic Algorithm Description

The objective of the genetic algorithm is to maximize the fitness function, which is used to measure how well the controller performs on X_0. We choose to maximize the minimum rate of convergence while minimizing the maximum control effort on the discrete checking set X_d^0. Therefore, the fitness function is defined as

Fitness(E_i) = w_1 (φ_i^γ − φ_min^γ)/(φ_max^γ − φ_min^γ) + w_2 ( 1 − (φ_i^u − φ_min^u)/(φ_max^u − φ_min^u) ), w_1, w_2 > 0   (2.13)

where

φ_i^γ = min_{x_d ∈ X_d^0} ( −V̇(x_d, E_i)/V(x_d, E_i) ), φ_i^u = max_{x_d ∈ X_d^0} |u(x_d, E_i)|,
φ_min^γ = min_i φ_i^γ, φ_max^γ = max_i φ_i^γ, φ_min^u = min_i φ_i^u, φ_max^u = max_i φ_i^u

The E_i is left in the function arguments to remind the reader that the i-th CLF and control are dependent upon the i-th matrix E_i, defined by (2.4), in the population. The two major components of the fitness function, (φ_i^γ − φ_min^γ)/(φ_max^γ − φ_min^γ) and 1 − (φ_i^u − φ_min^u)/(φ_max^u − φ_min^u), are considered subfitness functions and are used to normalize the rate of convergence and control effort to eliminate numerical problems due to scale mismatch between these two quantities.
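The min-max normalization in (2.13) can be computed for a whole population at once. A minimal sketch, assuming the subfitness values φ_i^γ and φ_i^u have already been evaluated on the checking set (the function name is illustrative):

```python
import numpy as np

def fitness(phi_gamma, phi_u, w1=1.0, w2=1.0):
    """Fitness (2.13): phi_gamma[i] and phi_u[i] are the convergence-rate
    and control-effort subfitness values of population member i. Each is
    min-max normalized across the population, so a high rate of convergence
    and a low control effort both raise the fitness."""
    g = (phi_gamma - phi_gamma.min()) / (phi_gamma.max() - phi_gamma.min())
    u = (phi_u - phi_u.min()) / (phi_u.max() - phi_u.min())
    return w1 * g + w2 * (1.0 - u)
```

Because both terms lie in [0, 1] regardless of the raw scales of −V̇/V and |u|, the weights w_1 and w_2 directly control the relative selection pressure.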
The selection process tends to select species good at achieving both a high rate of convergence and low control effort. The weights, w, are used to place more selection pressure towards a high rate of convergence, φ^γ, or low control effort, φ^u. They may be left as constants or changed dynamically as the algorithm progresses. In fact, it was found that pulsing the w's dynamically helps control the composition of the GA population, so that members of the population (or species) that satisfy one subfitness function well are combined with those that satisfy the other subfitness function well, speeding up the search for species that do both tasks well. Future work could quantify this improvement. To aid in understanding the interdependencies of the variables involved with the fitness function, Figure 2-1 below illustrates the computational flow of the fitness function.

Figure 2-1: Flow of Fitness Computation (E → Q = E^T E + cI → P from A^T P + PA − PBB^T P + Q = 0 → V = x^T P x, V̇ = ∇V f(x,u) with u from (2.7)-(2.8) → φ_i^γ and φ_i^u on X_d^0 → normalized subfitnesses → Fitness(E_i))

The genetic algorithm used in this work employs the basic evolutionary operators "copy", "crossover", and "mutate". They are defined as

Copy(E) = E   (2.9)

Crossover(E_1, E_2) = βE_1 + (1 − β)E_2   (2.10)

where β is a uniformly distributed random number on the interval [0, 2], and

Mutate(E) = E + Δ_E   (2.11)

where Δ_E is a uniformly distributed random matrix with dimensions matching those of E. The elements of Δ_E lie on the interval [−σ_E, σ_E], where σ_E is the user-defined mutation step size of E.
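The three operators (2.9)-(2.11) can be sketched directly. A minimal numpy sketch, using a caller-supplied random generator (the function names are illustrative, not from the thesis):

```python
import numpy as np

def copy_op(E):
    """Copy/reproduction (2.9): preserve an individual unchanged."""
    return E.copy()

def crossover(E1, E2, rng, beta=None):
    """Crossover (2.10): beta*E1 + (1-beta)*E2 with beta uniform on [0, 2];
    interpolates between the parents for beta < 1 and extrapolates along
    the line connecting them for beta > 1."""
    if beta is None:
        beta = rng.uniform(0.0, 2.0)
    return beta * E1 + (1.0 - beta) * E2

def mutate(E, sigma_E, rng):
    """Mutation (2.11): add a uniform random matrix with entries in
    [-sigma_E, sigma_E], sigma_E being the mutation step size."""
    return E + rng.uniform(-sigma_E, sigma_E, size=E.shape)
```

Shrinking `sigma_E` over the run recovers the small incremental steps discussed in Section 2.1.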
The Copy operator (also known as reproduction in the genetic algorithm literature) is used to preserve well-fit individuals in the population so that they may be used in future generations for generating better individuals via the Crossover or Mutation operators. The Crossover operator takes two highly fit individuals and either creates a new individual that interpolates between the two when β < 1 or extrapolates along the line connecting the two individuals in E search space when β > 1. The Mutation operator takes a single highly fit individual and creates a new individual by randomly perturbing its elements within the hypercube of side 2σ_E defined by Δ_E. To demonstrate the simplicity of imposing constraints on the GA optimization, and to further support the use of a genetic algorithm, we place an additional constraint on the control gains. All the elements of the LQR gain vector of (2.6) must be limited in magnitude by some upper bound, K_max. This constrains the control effort and focuses the search on areas in parameter space that yield physically realizable control signals. For the nonlinear control law (2.7), this upper bound affects the gains of the linearized controller. By (2.4) and (2.5), decreasing the norm of E will decrease the norm of K. Therefore we select the following filter on E to ensure the maximum gain constraint is satisfied:

Gain_Limit_Filter(E) = ηE if max_{i=1,2,…,n} |K_i| > K_max, and E otherwise, 0 < η < 1   (2.12)

The genetic algorithm uses this ad-hoc filter to update E such that controller gain limitations are enforced: if a gain is too high, then E is scaled down, indirectly scaling down the entire gain vector. The GA performs a parallel search of the parameter space (Jamshidi et al.). That is, rather than searching via a single point at a time, as with typical optimization methods, an entire set of points (i.e., the population) is considered at once.
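Applied repeatedly, the filter (2.12) shrinks E until the gain bound holds. A minimal sketch, where `gains_of` is a hypothetical caller-supplied map E → K standing in for the chain (2.4)-(2.6); by (2.4)-(2.5), scaling E down scales the gains down:

```python
import numpy as np

def gain_limit_filter(E, gains_of, K_max, eta=0.9, max_iter=200):
    """Ad-hoc gain limit filter of (2.12): while any element of the gain
    vector K(E) exceeds K_max in magnitude, scale E by eta (0 < eta < 1).
    `gains_of` is an illustrative stand-in for solving (2.4)-(2.6)."""
    for _ in range(max_iter):
        if np.max(np.abs(gains_of(E))) <= K_max:
            break
        E = eta * E
    return E
```

The loop is bounded by `max_iter` as a safeguard; in the GA it runs once per newly created individual.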
Members of the population are randomly selected for evolutionary operation to create the next population (or to move the population as a whole through the search space). To bias the population motion towards "progress", highly fit individuals are more likely to be chosen for evolutionary operation. The members of the population are first arranged by fitness, with the individuals of highest fitness placed at the beginning of the ordered list, denoted I_fit ⊂ {1, 2, …, N_pop}, where N_pop denotes the size of the population. The probability density function (pdf) for the selection of the i-th member of I_fit is

P(Selected = i ∈ I_fit) = ((N_pop + 1 − i)/N_pop)^s / Σ_{j=1}^{N_pop} ((N_pop + 1 − j)/N_pop)^s, s ≥ 0   (2.14)

where s acts as a fitness bias parameter. The form of the pdf of (2.14) was selected for the following reason. By changing s, we may adjust the size of the most likely portion of the population used during the selection process. The higher s becomes, the more we restrict selection to highly fit individuals. For example, if s = 0 then we have a uniform probability of selecting any member of I_fit. If s is very large, the only member of I_fit likely to be selected is the individual with the highest fitness, i.e., the first member in the ordered list I_fit. Once a species is selected, a random decision on which operation to perform must be made. The probabilities for Copy, Crossover, and Mutation are labeled p_r, p_c, and p_m, respectively. The restrictions on these parameters are

p_r + p_c + p_m = 1, p_r, p_c, p_m ≥ 0   (2.15)

Typically we insert the best-fit individual into the next generation and set p_r = 0. This is because two copies of the same individual are often selected for the Crossover operation, which results in another copy of that same individual, implicitly implementing the Copy function. The usual settings for the evolutionary operation probabilities in this work are p_r = 0, p_c = p_m = 0.5.
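The rank-biased selection pdf (2.14) and the operator choice of (2.15) can be sketched as follows (a minimal numpy sketch; the function names and default probabilities p_r = 0, p_c = p_m = 0.5 follow the settings stated above):

```python
import numpy as np

def selection_probs(n_pop, s):
    """Rank-selection pdf of (2.14): member i (i = 1 is the most fit) gets
    weight ((n_pop + 1 - i)/n_pop)^s, then weights are normalized.
    s = 0 gives uniform selection; large s concentrates on the most fit."""
    ranks = np.arange(1, n_pop + 1)
    w = ((n_pop + 1 - ranks) / n_pop) ** s
    return w / w.sum()

def pick_operator(rng, p_r=0.0, p_c=0.5, p_m=0.5):
    """Random choice among Copy, Crossover, and Mutate with the
    probabilities of (2.15), p_r + p_c + p_m = 1."""
    return rng.choice(["copy", "crossover", "mutate"], p=[p_r, p_c, p_m])
```

Sampling an individual is then `rng.choice(n_pop, p=selection_probs(n_pop, s))`, after which `pick_operator` decides which of (2.9)-(2.11) to apply.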
2.3.1 Making X_d^0 Dynamic

The required size of the checking set X_d^0 grows exponentially with the dimension of the system state, hence the need to strategically assign X_d^0 such that a minimal number of points is used. The objective is to maximize the minimum value of −V̇/V on X_d^0, i.e., maximize γ_min, and to minimize the maximum value of |u| on X_d^0, i.e., minimize u_max. For all practical purposes, if we knew where in X_0 these max/min and min/max conditions occurred, we could define X_d^0 as the set of these points, denoted critical points, so that the fitness function is evaluated only at the points that matter. The argument is that we don't have to check all the points in X_0, just the critical points. Unfortunately, the critical points for each species in the population are unknown; they are also functions of (P,K), hence functions of E, and they change after every generation. The clusters of E's in any population are perturbations of an average E for that cluster, so taking the critical points for each E on the set X_d^0 is a way to create a perturbation cluster of critical points, labeled C_d. The set C_d represents the best estimate of where the population of controllers (i.e., the E's) yields the minimum rate of convergence and maximum control effort on X_0. Therefore, it makes sense to evaluate the controllers of the next generation at these points. Hence, at the beginning of every generation, we redefine the checking set as X_d^0 = C_d + R_d: the critical points from the last generation (C_d) plus a set of randomly selected new points, R_d, that act as "exploration points" for finding better critical points for the current population. Better means yielding a lower fitness value than any point in C_d. This method reduces the needed size of X_d^0 and increases the speed of the GA.

2.4 Stochastic Estimation of γ_min

Once a satisfactory (P,K) pair is found, the stochastic checking method is used.
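The per-generation update X_d^0 = C_d + R_d can be sketched in a few lines (a minimal numpy sketch, assuming points are stored one per row and X_0 is a box; the function name is illustrative):

```python
import numpy as np

def refresh_checking_set(critical_pts, n_random, lo, hi, rng):
    """Section 2.3.1 update: keep the critical points C_d found for the
    previous population and append n_random uniformly sampled exploration
    points R_d drawn from the box [lo, hi]^n."""
    R_d = rng.uniform(lo, hi, size=(n_random, critical_pts.shape[1]))
    return np.vstack([critical_pts, R_d])
```

After evaluating the new population, the worst-case points found on this set become the next generation's C_d.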
The sampling method used to estimate the value of γ_min (1.18) is based on the Chebyshev inequality of probability theory (Resnick, 1999; Stark & Woods, 1994). We model γ as a uniformly distributed random variable, Γ, with mean μ_Γ and variance σ_Γ². The pdf for Γ is

p_Γ(z) = 1/(2√3 σ_Γ) for μ_Γ − √3 σ_Γ ≤ z ≤ μ_Γ + √3 σ_Γ, and 0 otherwise   (2.16)

This type of pdf is chosen because it is reasonable to assume we know only that the distribution of Γ is bounded from above and below when we uniformly sample it on X_0. Future work could use a better estimate of the distribution of Γ by using the actual function −V̇/V, since it is a known function of x. The Chebyshev inequality may be expressed as

P[ |μ̂_Γ − μ_Γ| ≥ ε ] ≤ σ_Γ²/(nε²)   (2.17)

where P(A) denotes the probability of event A and

μ̂_Γ = (1/n) Σ_{i=1}^n Γ_i   (2.18)

is the maximum likelihood estimate of μ_Γ (Stark & Woods, 1994), i.e., the sample average of Γ. If we knew the value of σ_Γ, then we could use (2.17) to specify a probable lower bound on γ, which from the pdf (2.16) would be

γ_min = inf_{x ∈ X_0} γ = μ̂_Γ − ε − √3 σ_Γ   (2.19)

The lower bound probability of (2.17), denoted P_γmin, is a free parameter which may be specified by setting n large enough in (2.17), i.e.,

n ≥ σ_Γ²/(P_γmin ε²)   (2.20)

We do not have the luxury of knowing σ_Γ, however, so an upper bound on σ_Γ must be estimated. The maximum likelihood estimate of σ_Γ (Stark & Woods, 1994), denoted σ̂_Γ, is given by

σ̂_Γ = ( (1/n) Σ_{i=1}^n (Γ_i − μ̂_Γ)² )^{1/2}   (2.21)

and, along with its uncertainty, û_σ, may be used to express the total value of σ_Γ, given by

σ_Γ = σ̂_Γ + û_σ   (2.22)

By Taylor series expansion of continuous functions, an approximation for the bound on û_σ (for sufficiently small ε) is

û_σ ≤ |∂σ̂_Γ/∂μ̂_Γ| ε   (2.23)

where the term ∂σ̂_Γ/∂μ̂_Γ may be thought of as the sensitivity of σ̂_Γ to the μ̂_Γ estimate, and ε as the uncertainty of μ̂_Γ.
The absolute value of ∂σ̂_Γ/∂μ̂_Γ is used to make a conservative estimate of σ_Γ. Computing ∂σ̂_Γ/∂μ̂_Γ and substituting it, along with the right-hand sides of (2.21)-(2.23), into (2.20) yields

n ≥ ( σ̂_Γ + |∂σ̂_Γ/∂μ̂_Γ| ε )² / (P_γmin ε²)   (2.24)

The result is a transcendental inequality (σ̂_Γ itself is computed from the n samples), which when satisfied guarantees that, for small enough ε,

γ_min = μ̂_Γ − ε − √3 ( σ̂_Γ + |∂σ̂_Γ/∂μ̂_Γ| ε )   (2.25)

holds with probability 1 − P_γmin. To solve (2.24), we specify the desired values of ε and P_γmin and guess a value for n. The right side of (2.24) is then computed and the inequality is checked. If the left side is less than the right side, a larger value of n is chosen. This process repeats until n is large enough. After (2.24) is satisfied, (2.25) is computed. If γ_min is positive, then we have, with probability 1 − P_γmin, an exponentially stable control law on the domain X ⊂ X_0, with a Lyapunov function to prove it. Equation (2.25) in practice turns out to be an overly conservative estimate of γ_min, however. This is because Γ is not truly a uniformly distributed random variable: it is a nonlinear function of a uniformly distributed random variable, making Γ non-uniformly distributed (Stark & Woods, 1994). We can roughly estimate γ_min directly by taking the lowest value of Γ while sampling X_0. Therefore, a better estimate of γ_min can be made by computing the uncertainty of the estimate and using it to compute a worst-case γ_min. Hence we define the sample minimum of Γ,

Γ_min = min_{x ∈ X_d^0} (Γ)   (2.26)

and the estimate of γ_min as

γ_min = Γ_min   (2.27)

We have used the stochastic arguments above to derive a reasonable estimate of the size of the checking set for post-GA numerical estimation of the minimum rate of convergence. We shall assume that this checking set is also sufficient for checking the maximum magnitude of the control signal.
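The Chebyshev sample-size rule (2.20) and the uniform-pdf lower bound (2.19)/(2.25) are simple to compute once a standard deviation (the inflated value of (2.22), or its estimate) is in hand. A minimal sketch (function names are illustrative, not from the thesis):

```python
import math

def chebyshev_sample_size(sigma, eps, p_fail):
    """Smallest n satisfying the Chebyshev condition (2.20),
    n >= sigma^2 / (p_fail * eps^2), so the sample mean of Gamma lies
    within eps of the true mean with probability at least 1 - p_fail.
    `sigma` is the (conservatively inflated) standard deviation of (2.22)."""
    return math.ceil(sigma ** 2 / (p_fail * eps ** 2))

def gamma_min_bound(mu_hat, sigma, eps):
    """Probable lower bound (2.19)/(2.25) for a uniform pdf of half-width
    sqrt(3)*sigma: gamma_min = mu_hat - eps - sqrt(3)*sigma."""
    return mu_hat - eps - math.sqrt(3.0) * sigma
```

For the transcendental form (2.24), one would recompute `sigma` from the n samples and call `chebyshev_sample_size` again, iterating until n stops growing.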
Note that for pdfs with substantial higher-order moments (moments beyond the mean and variance), we expect this estimate to be invalid. It is, however, a good starting point for the size of the checking set, and the set could be enlarged until the estimates of γ_min and u_max no longer change substantially. Randomly sampling γ_min and u_max as discussed above, along with a few numerical simulations, usually suffices as a confidence builder to the control engineer that the controller is indeed stable and within the allowable maximum control effort.

2.4.1 GA Flowchart

The flow chart of the algorithm is given below in Figure 2-2. At the beginning of the algorithm, a randomly generated set of E's is inserted into the first population (i.e., generation 0). The set of E's is used to compute a set of (P,K) pairs, which in our case means solving the HJB equation (2.5) for P and using it to solve for K, depending on the selection of the control law (2.6) or (2.7). The fitness computation of Figure 2-1 follows, and the fitness of the population is used to select the best individuals for performing the evolutionary operations defined in (2.9) through (2.11). The E with the highest fitness, denoted "best E", is returned along with the corresponding (P,K) pair. The controller performance, i.e., the minimum rate of convergence and maximum control effort, is numerically estimated by randomly sampling X_0 using the sample size defined by (2.24). If the performance is good enough, then the CLF and controller are accepted as a solution to the optimization problem, or possibly used in some other performance simulation. If the performance is unacceptable, the GA parameters are tuned or the controller performance requirements are relaxed, and the process is repeated. In the next chapter we use the ideas outlined above to tune the linear and nonlinear controllers of (2.6) and (2.7), respectively, for a set of nonlinear systems that are relatively difficult to control.
Figure 2-2: GA Optimization Algorithm (flowchart: generate a population of random E's; compute (P,K) from each E using the linearized dynamics A and B; evaluate fitnesses on the checking set X_d^0 with the nonlinear dynamics f, g; perform selection, evolutionary operations, and gain limiting; repeat until the maximum generation count is reached; numerically check the performance of the best E; if the performance is not good enough, adjust the GA parameters and repeat)

3 Full State-Feedback CLF Control

3.1 Selection of Benchmark Examples

The purpose of this chapter is to demonstrate how to use the ideas of Chapter 2 to tune full-state-feedback CLF controllers. The reader is reminded that the discussion is limited to the two controller types of (2.6) and (2.7), relisted below:

Linear (2.6): u = −B^T P x   (3.1)

Nonlinear (2.7): u = −( x^T P f(x) + √( (x^T P f(x))² + (x^T Q x)(x^T P g(x))² ) ) / (x^T P g(x)) if x^T P g(x) ≠ 0, and u = 0 if x^T P g(x) = 0   (3.2)

where P is the solution to the algebraic Riccati equation A^T P + PA − PBB^T P + Q = 0 using the linearized dynamics (A,B) of the nonlinear control-affine system defined by ẋ = f(x) + g(x)u, x ∈ ℜ^n, u ∈ ℜ, f, g: ℜ^n → ℜ^n, and the Q matrix selected by the quadratic integral performance index ∫_0^∞ (x^T Q x + u²) dt. The following example systems are taken from Ngamsom (2001). The theory developed by Ngamsom is based partially on maximizing the region of attraction (region of stability) of nonlinear systems using linear control laws. The drawback of the method proposed by Ngamsom is that the controller is synthesized using an uncertain linearized model of a nonlinear system. Therefore, the theory is based on linear control of nonlinear systems, i.e., only local information about the system dynamics and worst-case assumptions on the effects of the nonlinearities are used. The controller design intent here is to find a controller tuned specifically for the nonlinear system that is less conservative than the tune based on robust linear systems theory.
We use the genetic algorithm described in Chapter 2 to tune linear controllers (3.1) and compare with those of Ngamsom. For nonlinear control, we use the augmented Sontag's universal control law (3.2) from Primbs et al. (2000) because of its inherent global stability and local inverse optimality. We show how the genetic algorithm's tuning of the CLF and quadratic performance index q(x) = x^T Q x results in using less control effort than the CLF resulting from methods based on the linear dynamics. The main reason for the selection of the nonlinear control law is that it converges to the linear control law in a sufficiently small neighborhood about the origin. This allows the P matrix of Ngamsom's linear control method to be used directly in the nonlinear control law for an additional mode of comparison between Ngamsom's controller and the genetic-algorithm-tuned CLF controller. Ngamsom's linear control method happens to be a very effective design procedure for a robust linear controller. It is considered a good benchmark for the methods proposed in this thesis.

3.1.1 Comparing GAC to LARC

To facilitate comparison between Ngamsom's work and ours, we shall refer to the controller and CLF resulting from the method proposed herein as the "Genetic Algorithm Controller" (GAC). The GAC implementing the control law of (3.1) shall be referred to as the linear GAC (LGAC), and the GAC implementing the control law of (3.2) shall be referred to as the nonlinear GAC (NGAC). The linear controller that maximizes the region of stability is called the Lyapunov Attractive Region Controller (LARC), after Ngamsom. Similarly, the LARC implementing the control law of (3.1) shall be referred to as the linear LARC (LLARC), and the LARC implementing the control law of (3.2) shall be referred to as the nonlinear LARC (NLARC). Ngamsom's design method is a systematic method for designing linear controllers for a class of nonlinear systems.
The method uses the linearized dynamics of the nonlinear system along with assumptions on how the nonlinearities, treated as uncertainties, affect the linearized system. We observe that the size of the region of stability is a "side effect" of the LARC. That is, the region of stability of the LARC is not explicitly selected – it is an inherent result of the combined nonlinear system and linear controller. The proposed GAC method offers the ability to explicitly select the region of stability, in addition to direct tuning of both the minimum rate of convergence and the maximum control effort. This freedom comes with a price, however. For all of the examples examined in this study, the region of stability for the GAC was smaller overall than the region of stability for the LARC. On the other hand, the GAC allows the flexibility to select where in state space to attempt to achieve stability, as well as the ability to vary the rate of convergence while varying the control effort in this region.

3.1.2 Displaying Results

We display the performance of our example controllers in the region defined by X_0 in a set of tables. The tables list the estimated values for the minimum rate of convergence γ_min, the mean rate of convergence μ_γ, the standard deviation of the rate of convergence σ_γ, the maximum control effort magnitude u_max, the mean control effort μ_u, and the standard deviation of the control effort σ_u. These quantities are estimated by uniform random sampling in X_0. Equation (2.24) is used to determine a sufficient sample size. For the second-order system of the first example, we include 3D plots in X_0 of the level sets of the Lyapunov function, the value of γ, and the locations of the estimated critical points, defined as the location estimates where γ_min and u_max occur for every controller in the GA population. The 3D plots aid in visualizing the effects the CLF shape has on the rate of convergence and the locations of the critical points.
We leave the rate-of-convergence plot blank where γ is negative or very close to the origin, where it is numerically ill-conditioned. The blank spots where the rate of convergence is negative depict "holes" where the system becomes unstable with respect to the specific CLF.

3.2 Example 1: Artificial System

The equations of motion for the artificial second-order system considered by Ngamsom (2001) are:

ẋ = [ x2 + 5 sin(x1 x2) ; 10 x1 + 10 x2 - 10 sin(x1) sin(x2) ] + [ 0 ; 1.5 + cos(x1) ] u,
x = [x1 x2]ᵀ ∈ ℝ², u ∈ ℝ  (3.3)

The linearized dynamics about the origin are:

ẋ = [0 1; 10 10] x + [0; 2.5] u  (3.4)

Table 3-1 displays the results of the LARC method: the Lyapunov function matrix P, the linear control gains K, and the matrix Q of the quadratic performance index. The structure of Q in Ngamsom's work is typically fixed as a diagonal matrix, and the norm of Q is tuned via an optimization method. One advantage of the proposed GAC method lies in its ability to find a more general Q (with nonzero off-diagonal elements) to optimize the performance index.

P = [4160.8 125.6; 125.6 54.4]   K = [157.49 68.244]   Q = [16000 0; 0 16000]
Table 3-1: LLARC Parameters for System (3.3) (Ngamsom, 2001)

Table 3-2 displays the randomly sampled performance of the linear LARC of Table 3-1 on the artificial system (3.3). The range of X0 is selected as x1 ∈ [-25, 25] and x2 ∈ [-100, 100]. Note the negative value of γ_min, implying instability.

x1 ∈ [-25, 25], x2 ∈ [-100, 100]: γ_min = -76.935   μ_γ = 47.769   σ_γ = 80.6   u_max = 10709   μ_u = 3805.6   σ_u = 2515.5
Table 3-2: Performance of LLARC on System (3.3)

The randomly sampled performance of the nonlinear LARC on the artificial system (3.3) is displayed in Table 3-3. The minimum rate of convergence is positive by design, and the control effort is very high.
x1 ∈ [-25, 25], x2 ∈ [-100, 100]: γ_min = 1.0294   μ_γ = 135.75   σ_γ = 128.340   u_max = 1.4878×10⁷   μ_u = 17928   σ_u = 1.6251×10⁵
Table 3-3: Performance of NLARC on System (3.3)

Table 3-4 displays the randomly sampled performance of the LLARC on the linearized artificial system dynamics. This table is used to quantify the local performance of both controllers about the origin. It allows a comparison between the linear and nonlinear behavior of the system and helps quantify the effects the nonlinearities have on the system performance.

x1 ∈ [-25, 25], x2 ∈ [-100, 100]: γ_min = 3.7641   μ_γ = 91.364   σ_γ = 93.507   u_max = 10709   μ_u = 3796.5   σ_u = 2518.9
Table 3-4: Performance of LLARC on Linearized System (3.4)

Consider the level sets of the CLF using P from Table 3-1 in Figure 3-1. The shape and orientation of the elliptical "rings" determine the path of the state trajectory under the CLF control law. A true CLF control law forces the state trajectory to cross into successively smaller rings. The trajectory in Figure 3-1 actually increases the CLF value initially, yet eventually converges to the origin. This behavior explains the negative value of γ_min in Table 3-2. The CLF is therefore only a local CLF and not a CLF on X0. The system is unstable with respect to this CLF, but may be locally stable for some other Lyapunov function; perhaps one whose level sets are turned slightly more counterclockwise, so that the state trajectory does not cross into a higher level set.

Figure 3-1: Level Sets of V for LLARC on System (3.3)

The rate of convergence of the CLF from Figure 3-1 is displayed in Figure 3-2. The points in X0 where γ < 0, or points close to the origin that are numerically ill-conditioned due to division by a very small number, are left blank. The points where γ < 0 are points where the trajectory crosses into larger level sets and exits X0, possibly never to return.
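When the closed loop is linear, the worst-case rate of convergence need not be sampled at all: with V = xᵀPx and V̇ = -xᵀMx, the minimum of γ = -V̇/V over all nonzero x is the smallest generalized eigenvalue of (M, P). The check below applies this to the Table 3-1 data on the linearized dynamics (3.4), assuming the convention γ = -V̇/V; the resulting value is of the same order as the sampled γ_min in Table 3-4.

```python
import numpy as np

# LLARC data from Table 3-1 and the linearized dynamics (3.4).
A = np.array([[0.0, 1.0], [10.0, 10.0]])
B = np.array([[0.0], [2.5]])
K = np.array([[157.49, 68.244]])
P = np.array([[4160.8, 125.6], [125.6, 54.4]])

Acl = A - B @ K
M = -(Acl.T @ P + P @ Acl)       # Vdot = -x'Mx along the linearized closed loop

# With gamma = -Vdot/V, the worst case over all x is the smallest
# generalized eigenvalue of (M, P); it is positive iff V certifies stability.
gamma_min = np.linalg.eigvals(np.linalg.solve(P, M)).real.min()
print(f"gamma_min on the linearized system: {gamma_min:.4f}")
```

A positive value here confirms that the LLARC's CLF works locally; the "holes" appear only once the nonlinear terms of (3.3) are switched back on.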
We see that in the region where the trajectory crosses into successively higher level sets, the rate of convergence is negative, as expected.

Figure 3-2: "Gamma" (γ) for LLARC on System (3.3)

Figure 3-3 shows an estimated region of attraction for the controller reported in Ngamsom (2001), obtained by Monte Carlo simulation. The shaded blocks represent regions that yield a stable trajectory when used as the region of the initial state; all other blocks yield an unstable trajectory. A Lyapunov function whose level sets line up with the edge of stability in Figure 3-3 (where white and black boxes are in contact) would be a better selection for proving the true region of stability of the artificial system (3.3) via Lyapunov stability theory. The values of γ here are negative, supporting the results of the Monte Carlo simulation used to generate Figure 3-3. The objective of the GA optimization is to make all values of γ positive and large while minimizing the control effort required to do so.

Figure 3-3: Stability Table from Ngamsom (2001)

Figure 3-4 and Figure 3-5 display the states and control signal of the linear LARC on the artificial system (3.3) for the initial condition (10, 20). The states converge shortly after 0.2 s, while u_max reaches approximately 2700.

Figure 3-4: States of System (3.3) Using LLARC
Figure 3-5: Control Signal of LLARC on System (3.3)

Figure 3-6 displays the level sets of the CLF for the nonlinear LARC and the state trajectory of the artificial system (3.3) for the initial condition (10, 20). The state trajectory is forced to cross successively lower levels of the CLF and exhibits very sudden, "switch-like" direction changes. Such behavior is due to V̇ satisfying the strict equality V̇ = -√((xᵀPf)² + (xᵀQx)(xᵀPg)²). Viewing Figure 3-7, we see that the rate of convergence is positive everywhere by design of the control law.
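The nonlinear law can be sketched pointwise from V = xᵀPx. The function below is a sketch of a Sontag-type law of the form used in (3.2), with a and b proportional to L_fV and L_gV (the common factor of 2 from ∇V = 2Px cancels in u); the toy check reuses Table 3-1's P and Q on the linearized dynamics (3.4), which is an illustrative substitution rather than the thesis's simulation.

```python
import numpy as np

def sontag_u(x, P, Q, f, g):
    """Sontag-type CLF control for V = x'Px (sketch of the form of (3.2))."""
    a = x @ P @ f(x)              # proportional to Lf V
    b = x @ P @ g(x)              # proportional to Lg V
    q = x @ Q @ x                 # quadratic performance index q(x)
    if abs(b) < 1e-12:            # on the singular set Lg V = 0
        return 0.0
    return -(a + np.sqrt(a * a + q * b * b)) / b

# Toy check on the linearized artificial dynamics (3.4): Vdot < 0 away from origin.
A  = np.array([[0.0, 1.0], [10.0, 10.0]])
Bv = np.array([0.0, 2.5])
f = lambda x: A @ x
g = lambda x: Bv
P = np.array([[4160.8, 125.6], [125.6, 54.4]])     # Table 3-1
Q = 16000.0 * np.eye(2)

x = np.array([10.0, 20.0])
u = sontag_u(x, P, Q, f, g)
Vdot = 2.0 * x @ P @ (f(x) + g(x) * u)
print(f"u = {u:.1f}, Vdot = {Vdot:.1f}")
```

By construction the closed-form law makes V̇ strictly negative wherever b ≠ 0, which is exactly the "forced descent" through the level sets visible in Figure 3-6.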
The peaks in the γ plot correspond to the fastest rates of change, and they are evident in the state transition plot of Figure 3-8. The control law forces γ to be strictly positive, so that the part of the trajectory in Figure 3-1 that crosses into higher levels of the CLF under the linear LARC is now steered inward toward successively lower levels of the CLF. Figure 3-8 and Figure 3-9 display the states and control signal for the (10, 20) initial-condition trajectory. The states do not converge much faster in the nonlinear LARC case than in the linear LARC case; however, the maximum magnitude of x2 is smaller for the nonlinear LARC, because the trajectory is forced to stay inside the level set corresponding to the initial condition. The control effort is, of course, much larger for the nonlinear LARC. Figure 3-9 shows the cost of the renewed stability obtained by using the nonlinear controller instead of the linear controller: excessively large control effort. This example makes evident the need to reorient the CLF level sets to help decrease the large control effort.

Figure 3-6: Level Sets of V for NLARC on System (3.3)
Figure 3-7: "Gamma" (γ) for NLARC on System (3.3)
Figure 3-8: States of System (3.3) Using NLARC
Figure 3-9: Control Signal of NLARC on System (3.3)

The LARC method does not directly use information about the rate of convergence or about the control effort of the nonlinear control system. In fact, P is not explicitly treated as the matrix of the quadratic CLF V = xᵀPx, but it may be viewed as such. The following question is the primary motivation for this work.

Question 3.1: Can we achieve better control performance than that of Ngamsom (2001) for the nonlinear system, in terms of minimum rate of convergence and maximum control effort in X0, by using the CLF V = xᵀPx and directly checking these performance measures at all points of X0d to maximize the performance index defined by the fitness function (2.13)?
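The rest of the chapter attacks Question 3.1 with a genetic algorithm. The loop below is a heavily simplified sketch of that idea, not the Chapter 2 algorithm or the fitness function (2.13): Q is parameterized by three numbers, P and K come from an LQR Riccati solve with an assumed R = 1, γ_min is computed exactly for the linearized closed loop, u_max is the corner value of |Kx| over X0, and the 1/1000 normalization, population sizes, and mutation scheme are all illustrative assumptions.

```python
import numpy as np
from scipy.linalg import solve_continuous_are

rng = np.random.default_rng(1)
A = np.array([[0.0, 1.0], [10.0, 10.0]])
B = np.array([[0.0], [2.5]])
BOX = np.array([25.0, 100.0])              # X0 = [-25, 25] x [-100, 100]
W1, W2 = 1.0, 1.0                          # fitness weights, cf. W = [1 1]

def fitness(theta):
    # theta = (a, b, c) parameterizes Q = [[exp(a), c], [c, exp(b)]]; reject non-PD Q.
    Q = np.array([[np.exp(theta[0]), theta[2]], [theta[2], np.exp(theta[1])]])
    if np.linalg.eigvalsh(Q)[0] <= 1e-9:
        return -np.inf
    P = solve_continuous_are(A, B, Q, np.eye(1))
    K = B.T @ P                             # R = 1, so K = B'P
    # For the linear closed loop, Vdot = -x'(Q + K'K)x, so the worst-case rate
    # gamma_min is the smallest generalized eigenvalue of (Q + K'K, P).
    gamma_min = np.linalg.eigvals(np.linalg.solve(P, Q + K.T @ K)).real.min()
    u_max = np.abs(K[0]) @ BOX              # |u| is maximized at a corner of X0
    return W1 * gamma_min - W2 * u_max / 1000.0   # ad hoc normalization

pop = rng.normal(0.0, 2.0, size=(30, 3))          # initial population
for gen in range(40):
    scores = np.array([fitness(t) for t in pop])
    elite = pop[np.argsort(scores)[-10:]]          # keep the 10 fittest
    children = elite[rng.integers(0, 10, 20)] + rng.normal(0.0, 0.3, (20, 3))
    pop = np.vstack([elite, children])             # elitism plus mutation

best = pop[np.argmax([fitness(t) for t in pop])]
print("best fitness:", fitness(best))
```

Even this toy version exhibits the tension the thesis explores: raising the weight on γ_min drives the gains, and hence u_max, upward.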
We now attempt to answer Question 3.1 by applying the proposed GAC method to the example systems.

3.2.1 GAC Case #1: Linear GAC

The genetic algorithm parameters for Case #1 and the resulting GAC performance are given in Table 3-5. This case implements linear control. The GA parameters are described in Chapter 2 and in the appendix, where they are used in the Matlab code that implements the GA optimization procedure.

System: Artificial (3.3)   Controller type: linear
Generations: 50   Population size: 50
Checking points: 100   Critical points: 33
ΔE: 100   Kmax: 1000   c: 0
X0 range: x1 ∈ [-25, 25], x2 ∈ [-100, 100]
Table 3-5: GA Parameters to Optimize LGAC for System (3.3)

The GA results of the optimization of the linear GAC for the artificial system (3.3) are listed in Table 3-6. The W vector contains the weighting factors used in the fitness function (2.13) to vary the weight between rate of convergence and control effort. The results in Table 3-6 include the three weight sets W = [1 0], W = [1 1], and W = [0 1]. They represent three cases in which the GA selects population members with a large γ_min while ignoring u_max, a large γ_min and a small u_max, and a small u_max while ignoring γ_min, respectively. The reader is reminded that, ultimately, the GA tunes Q to yield a favorable pair of P and K. Therefore, comparing the GAC to the LARC begins with comparing their respective Q matrices. As can be seen, the Q's of the linear GAC are very different from the Q's of the linear LARC. An obvious difference is the nonzero off-diagonal terms in the Q's resulting from the GA optimization.
Case 1 (W = [1 0]): P = [2768.8 347.3; 347.3 71.8]   K = [868.23 179.38]   Q = [6.9845×10⁵ 1.1141×10⁵; 1.1141×10⁵ 20450]
Case 2 (W = [1 1]): P = [2993.0 352.2; 352.2 62.1]   K = [880.42 155.16]   Q = [715290 102800; 102800 16910]
Case 3 (W = [0 1]): P = [3.9147 3.5413; 3.5413 3.5308]   K = [8.8532 8.8269]   Q = [0.0854 0.0452; 0.0452 0.0267]
Table 3-6: Optimized Parameters of LGAC for System (3.3)

Table 3-7 lists the randomly sampled performance of the GAC for the three weight cases. As one might expect, both γ_min and u_max decrease as the emphasis shifts from minimum rate of convergence to maximum control effort. Unfortunately, this is not always the case, as we shall see in the later examples. The first two cases in Table 3-7 yield much better performance than the LARC on X0 in terms of minimum rate of convergence, but not in terms of maximum control effort. Only the third case yields a better value of maximum control effort; however, the third case yields an unstable system and therefore is not a fair comparison to the LARC control effort. Except for the last case, the controller gains are much larger for the GAC than for the LARC. The design objective of achieving a simultaneously better rate of convergence and better maximum control effort than the LARC is not met. Though not explored here, it is posited that a weight case somewhere between Cases 2 and 3 might result in a controller that achieves the design objective, or at least comes closer to it.

Case 1 (W = [1 0]): γ_min = 34.564   μ_γ = 301.74   σ_γ = 208.37   u_max = 39514   μ_u = 13381   σ_u = 9303.7
Case 2 (W = [1 1]): γ_min = 29.737   μ_γ = 269.47   σ_γ = 174.48   u_max = 37392   μ_u = 12829   σ_u = 8786.4
Case 3 (W = [0 1]): γ_min = -319.66   μ_γ = 6.3853   σ_γ = 76.592   u_max = 1099.2   μ_u = 449.21   σ_u = 270.73
Table 3-7: Performance of LGAC on System (3.3)

Table 3-8 lists the randomly sampled performance of the GA controller on the linearized system (3.4). All minimum rates of convergence are positive, as expected for the linear system, because the linear GAC is a linear quadratic regulator.
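As a quick consistency check on Table 3-6 (a sketch assuming the underlying LQR uses R = 1, which is not stated explicitly in this chapter), the weight case 3 parameters satisfy the LQR gain relation K = R⁻¹BᵀP to the printed precision:

```python
import numpy as np

# P and K for the LGAC weight case W = [0 1] (Table 3-6).
P = np.array([[3.9147, 3.5413], [3.5413, 3.5308]])
K = np.array([8.8532, 8.8269])
B = np.array([0.0, 2.5])    # input matrix of the linearized dynamics (3.4)

# For an LQR with R = 1, the gain satisfies K = B'P.
K_check = B @ P
print(K_check)              # approximately [8.8533, 8.8270], matching K above
```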
From Table 3-8 we observe that the local rate of convergence is faster, indicating that the system nonlinearities work against the controller's effort to stabilize the system.

W = [1 0]: γ_min = 76.5938   μ_γ = 522.89   σ_γ = 230.55   u_max = 3933.4   μ_u = 13250   σ_u = 9287.8
W = [1 1]: γ_min = 92.0664   μ_γ = 470.25   σ_γ = 174.51   u_max = 3730.0   μ_u = 12781   σ_u = 8801.2
W = [0 1]: γ_min = 0.05950   μ_γ = 20.733   σ_γ = 3.6937   u_max = 1096.7   μ_u = 449.46   σ_u = 270.17
Table 3-8: Performance of LGAC on Linearized System (3.4)

Figure 3-10, Figure 3-11, and Figure 3-12 contain plots of the level sets of the CLF with the state trajectory for the initial condition (10, 20), the rate of convergence γ, and the locations of the estimated critical points for all controllers in the GA population. We see that γ > 0 on all of X0; i.e., there are no "holes" in the γ plot, unlike that of the LARC. Many critical points lie in the top left of Figure 3-12, where the value of γ is low (see Figure 3-11) for many members of the GA population. Had we not used the critical-point estimation and instead selected a fixed or random grid as the checking set X0d, many of these points would have been missed, and much of the processing time would have been spent checking irrelevant points with a high γ value or a low u_max value. The number of critical points was set to 33. The placement of these points by the critical-point search algorithm (Chapter 2, "Making X0d Dynamic") is displayed in Figure 3-12. The total number of points is 100, three times the number of critical points. The critical-point search algorithm clustered the majority of the critical points in the top left corner. It is reasonable to assume the spacing of these points is close to the spacing required to satisfy Theorem 1.1. If this is the case, then many more than the 100 search points used in X0d would be required to satisfy the spacing requirements of Theorem 1.1, and hence the search procedure would be much slower.
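The "Making X0d Dynamic" heuristic itself is given in Chapter 2; the sketch below is only one plausible scheme in that spirit, with a made-up γ surface whose low-γ "crevice" stands in for the clustered region of Figure 3-12. The keep-the-worst/resample-nearby rule and all of its constants are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
LO, HI = np.array([-25.0, -100.0]), np.array([25.0, 100.0])

def gamma(x):
    # Stand-in rate-of-convergence surface with a low-gamma "crevice" along x2 = 4*x1.
    return 1.0 + 0.001 * (x[:, 1] - 4.0 * x[:, 0]) ** 2

def refine(points, n_crit=33, iters=20):
    """Move checking points toward low-gamma (critical) regions."""
    for _ in range(iters):
        g = gamma(points)
        crit = points[np.argsort(g)[:n_crit]]            # keep the worst points
        # Resample most remaining points near the critical ones, with some
        # uniform exploration so new low-gamma regions can still be found.
        near = crit[rng.integers(0, n_crit, len(points) - 2 * n_crit)]
        near = near + rng.normal(0.0, 2.0, near.shape)
        explore = rng.uniform(LO, HI, (n_crit, 2))
        points = np.clip(np.vstack([crit, near, explore]), LO, HI)
    return points

pts = refine(rng.uniform(LO, HI, (100, 2)))
print("final minimum gamma estimate:", gamma(pts).min())
```

Because the worst points are never discarded, the running γ_min estimate is monotone, and the 100-point set ends up concentrated along the crevice, just as the thesis's critical points concentrate in the corner of Figure 3-12.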
The heuristic formulated in Chapter 2 for moving the search points around to find the critical points appears to be effective.

Figure 3-10: Level Sets of V for LGAC on System (3.3) (W = [1 0])
Figure 3-11: "Gamma" (γ) for LGAC on System (3.3) (W = [1 0])
Figure 3-12: Critical Points for LGAC Population Using System (3.3) (W = [1 0])

It is apparent from Figure 3-8, Figure 3-9, Figure 3-13, and Figure 3-14 that the time of convergence is much faster for the linear GAC with W = [1 0] than for the linear LARC, but the control effort is higher. Such a result is acceptable, because no weight was placed on the control effort during the optimization. Another notable observation is how the trajectory of Figure 3-10 uniformly crosses into successively lower level sets, even though the control law is linear and does not impose on the state trajectories the restrictive effect that the nonlinear controller does. Figure 3-15 through Figure 3-19 are very similar to Figure 3-10 through Figure 3-14. This is expected, given that the optimization yielded very similar sets of control gains. The control effort is slightly smaller because the control effort carries a weight equal to that of the rate of convergence, rather than zero as in the previous case. It is notable how sensitive the rate of convergence and control effort are to the weights. Figure 3-20 through Figure 3-24 show the results of not considering the rate of convergence during the optimization (i.e., W = [0 1]). For W = [0 1], a very low control effort is achieved, but over much of X0 we have γ < 0. Because the level sets are open, even where γ > 0 the trajectory can exit X0 into the region of state space where the performance has not been checked and no stability guarantees can be made. Although more research must be performed, it is reasonable to assume that some weight vector between W = [1 1] and W = [0 1] (setting w1 between 0 and 1) might yield a stable controller with a smaller value of u_max than the linear LARC.
Such a nonlinear dependence of the performance measures on W makes tuning W a nontrivial task.

Figure 3-13: States of System (3.3) Using LGAC (W = [1 0])
Figure 3-14: Control Signal of LGAC on System (3.3) (W = [1 0])
Figure 3-15: Level Sets of V for LGAC on System (3.3) (W = [1 1])
Figure 3-16: "Gamma" (γ) for LGAC on System (3.3) (W = [1 1])
Figure 3-17: Critical Points for LGAC Population Using System (3.3) (W = [1 1])
Figure 3-18: States of System (3.3) Using LGAC (W = [1 1])
Figure 3-19: Control Signal of LGAC on System (3.3) (W = [1 1])
Figure 3-20: Level Sets of V for LGAC on System (3.3) (W = [0 1])
Figure 3-21: "Gamma" (γ) for LGAC on System (3.3) (W = [0 1])
Figure 3-22: Critical Points for LGAC Population Using System (3.3) (W = [0 1])
Figure 3-23: States of System (3.3) Using LGAC (W = [0 1])
Figure 3-24: Control Signal of LGAC on System (3.3) (W = [0 1])

Summarizing the comparison between the linear LARC and the linear GAC on the artificial system (3.3), we make the following observations. The linear GAC yields a strictly positive γ on X0 for the two instances in which the rate of convergence affects the fitness function (W = [1 0] and W = [1 1]). The minimum rate of convergence is higher for the linear GAC for the weight settings W = [1 0] and W = [1 1], but the maximum control effort is also much higher. The linear GAC for the weight setting W = [0 1] yields a much lower maximum control effort than the linear LARC, but the controller is unstable, which raises the question: with the correct setting of W, can both γ_min and u_max be improved on X0? Although further research would be needed, viewing the trend of γ_min and u_max with the settings of W in Table 3-7, it is reasonable to assume such a weight setting exists.

3.2.2 GAC Case #2: Nonlinear GAC

We now consider nonlinear control of the artificial system (3.3). The genetic algorithm parameters for this case are given in Table 3-9.
System: Artificial (3.3)   Controller type: nonlinear
Generations: 50   Population size: 50
Checking points: 100   Critical points: 33
ΔE: 100   Kmax: 1000   c: 0
X0 range: x1 ∈ [-25, 25], x2 ∈ [-100, 100]
Table 3-9: GA Parameters to Optimize NGAC for System (3.3)

Table 3-10 lists the results of the optimization of the nonlinear GAC for the artificial system (3.3) using the parameters from Table 3-9. As in the case of the linear GAC, the GA found nondiagonal solutions for Q in all three cases.

Case 1 (W = [1 0]): P = [292.952 239.153; 239.153 195.615]   K = [597.88 489.04]   Q = [3.5160×10⁵ 2.8683×10⁵; 2.8683×10⁵ 2.3398×10⁵]
Case 2 (W = [1 1]): P = [1664.4 348.60; 348.60 77.100]   K = [871.55 192.71]   Q = [7.2631×10⁵ 1.3836×10⁵; 1.3836×10⁵ 0.2661×10⁵]
Case 3 (W = [0 1]): P = [3.9109 3.5506; 3.5506 3.5455]   K = [8.8766 8.8638]   Q = [0.5750 0.5136; 0.5136 0.4625]
Table 3-10: Optimized Parameters of NGAC for System (3.3)

Table 3-11 displays the randomly sampled performance of the nonlinear GAC on the artificial system (3.3). Here we see that the rates of convergence are positive even for the case in which the rate of convergence is not a factor in the fitness function (W = [0 1]). The control effort is, however, much higher for the nonlinear GAC than for the linear GAC, similar to the comparison between the LARC and the nonlinear controller using P from the LARC optimization. A much better γ_min is achieved along with a much better u_max using the nonlinear GAC, but the nonlinear GAC has a higher average control effort than the nonlinear LARC. If such a feature is undesirable to the control engineer, the GA fitness function may easily be augmented with a term that favors low average control effort, although more points in X0d would be required to obtain a realistic estimate of the average control effort.
W = [1 0]: γ_min = 12.685   μ_γ = 1446.3   σ_γ = 679.85   u_max = 68306   μ_u = 25642   σ_u = 15525
W = [1 1]: γ_min = 7.8767   μ_γ = 476.99   σ_γ = 251.86   u_max = 51756   μ_u = 14837   σ_u = 10615
W = [0 1]: γ_min = 0.0384   μ_γ = 53.520   σ_γ = 57.022   u_max = 62062   μ_u = 1394.0   σ_u = 2103.7
Table 3-11: Performance of NGAC on System (3.3)

Table 3-12 presents the randomly sampled performance of the nonlinear GAC on the linearized artificial system (3.4). In all three cases, the local performance differs substantially from that across the entire set X0.

W = [1 0]: γ_min = 2.3226   μ_γ = 2394.2   σ_γ = 169.28   u_max = 6362.4   μ_u = 25038   σ_u = 15347
W = [1 1]: γ_min = 45.683   μ_γ = 786.90   σ_γ = 205.97   u_max = 40986   μ_u = 13668   σ_u = 9612.2
W = [0 1]: γ_min = 0.0287   μ_γ = 20.980   σ_γ = 3.6402   u_max = 1102.7   μ_u = 452.00   σ_u = 269.85
Table 3-12: Performance of NGAC on Linearized System (3.4)

Figure 3-25, Figure 3-26, and Figure 3-27 contain plots of the CLF with the state trajectory for the initial condition (10, 20), the rate of convergence γ, and the locations of the estimated critical points for all controllers in the GA population. The GA yielded open level sets on X0. The sampled performance yielded γ_min = 2.3226; however, the true value is clearly negative, given the unstable trajectory. This problem occurs when the ratio λ_min(P)/λ_max(P) is small (e.g., λ_min(P)/λ_max(P) = 4.68×10⁻⁴ for the W = [1 0] case). The level sets are nearly open, and γ is actually slightly negative at some points along the line the trajectory follows to exit X0 in Figure 3-25. Trajectories originating in this region are pulled into the crevice of negative γ values and follow it out of X0. The points along this line (the "escape manifold") are hard to find with a small checking set, causing the GA to use a bad estimate of the true value of γ_min when evaluating the controllers. One solution is simply to increase the number of checking points in X0d; the next GA optimization run ("Case #3") in the discussion is based on this approach. The ultimate solution, recommended for future research, is to guarantee that X0 contains only closed level sets.
That is, guarantee that V(x) = V(y) for all x, y ∈ ∂X0.

Figure 3-25: Level Sets of V for NGAC on System (3.3) (W = [1 0])
Figure 3-26: "Gamma" (γ) for NGAC on System (3.3) (W = [1 0])
Figure 3-27: Critical Points for NGAC Population Using System (3.3) (W = [1 0])
Figure 3-28: States of System (3.3) Using NGAC (W = [1 0])
Figure 3-29: Control Signal of NGAC on System (3.3) (W = [1 0])

The case with W = [1 1] (Figure 3-30 through Figure 3-34) yields a stable trajectory. In fact, the nonlinear GAC with W = [1 1] yields a trajectory that converges about twice as fast, with about half the maximum control effort, compared with the nonlinear LARC. The design objective is therefore achieved despite the problem of not using a large enough checking set X0d.

Figure 3-30: Level Sets of V for NGAC on System (3.3) (W = [1 1])
Figure 3-31: "Gamma" (γ) for NGAC on System (3.3) (W = [1 1])
Figure 3-32: Critical Points for NGAC Population Using System (3.3) (W = [1 1])
Figure 3-33: States of System (3.3) Using NGAC (W = [1 1])
Figure 3-34: Control Signal of NGAC on System (3.3) (W = [1 1])

Weight case 3 suffers from the same problem as weight case 1 (Figure 3-35 through Figure 3-39). However, using more search points or guaranteeing V(x) = V(y) for all x, y ∈ ∂X0 would not necessarily guarantee a stable controller, since γ_min has zero weight in the fitness function. Case 3 is used to show the GA's ability to minimize the maximum control effort for the given CLF and controller topologies.

Figure 3-35: Level Sets of V for NGAC on System (3.3) (W = [0 1])
Figure 3-36: "Gamma" (γ) for NGAC on System (3.3) (W = [0 1])
Figure 3-37: Critical Points for NGAC Population Using System (3.3) (W = [0 1])
Figure 3-38: States of System (3.3) Using NGAC (W = [0 1])
Figure 3-39: Control Signal of NGAC on System (3.3) (W = [0 1])

3.2.3 GAC Case #3: Denser Checking Sets

The genetic algorithm parameters for Case #3 are given in Table 3-13.
We use the same GA parameters as in Case #2 with nonlinear control, but the number of checking points in X0d has been doubled. The intention is to catch the numerically ill-conditioned "crevice" in the γ plots of Case #2 and to tune P such that the ratio λ_min(P)/λ_max(P) is not too small, yet is not restricted by some predefined lower threshold.

System: Artificial (3.3)   Controller type: nonlinear
Generations: 20   Population size: 50
Checking points: 200   Critical points: 100
ΔE: 100   Kmax: 1000   c: 0
X0 range: x1 ∈ [-25, 25], x2 ∈ [-100, 100]
Table 3-13: GA Parameters to Optimize NGAC for System (3.3) (High-Density X0d)

Table 3-14 lists the results of the optimization of the nonlinear GAC for the artificial system (3.3) using the parameters from Table 3-13. Comparing Table 3-10 to Table 3-14, the doubling of the size of X0d appears to have substantially affected the GA's output for Weight Case 1.

Case 1 (W = [1 0]): P = [4976.2 339.00; 339.00 36.100]   K = [847.46 90.348]   Q = [5.8939×10⁵ 19600; 19600 1060]
Case 2 (W = [1 1]): P = [4287.3 376.90; 376.90 45.800]   K = [942.37 114.517]   Q = [8.0232×10⁵ 60900; 60900 5480]
Case 3 (W = [0 1]): P = [3.8772 3.5242; 3.5242 3.5240]   K = [8.8106 8.8100]   Q = [0.0829 0.0751; 0.0751 0.0688]
Table 3-14: Optimized Parameters of NGAC for System (3.3) (High-Density X0d)

Table 3-15 displays the randomly sampled performance of the nonlinear GAC on the artificial system (3.3) using double the number of elements in X0d. Table 3-16 displays the randomly sampled performance of the nonlinear GAC on the linearized artificial system (3.4) using double the number of elements in X0d. The change in the performance of the nonlinear GAC for W = [1 0] between Case #2 and Case #3 is understandable, given the substantial change in P and K. The change in γ_min of the nonlinear GAC for W = [0 1] between Case #2 and Case #3 was not expected, because P and K are very similar between the two cases.
A likely cause of the discrepancy is the small ratio λ_min(P)/λ_max(P) = 0.0238, for which the true value of γ_min lies on a thin "escape manifold," as discussed in Case #2.

Case 1 (W = [1 0]): γ_min = 32.628   μ_γ = 174.013   σ_γ = 62.836   u_max = 87234   μ_u = 14322   σ_u = 12010
Case 2 (W = [1 1]): γ_min = 33.280   μ_γ = 234.056   σ_γ = 101.40   u_max = 73692   μ_u = 14871   σ_u = 11249
Case 3 (W = [0 1]): γ_min = 0.3079   μ_γ = 53.8784   σ_γ = 57.666   u_max = 57181   μ_u = 1399.1   σ_u = 2042.3
Table 3-15: Performance of NGAC on System (3.3) (High-Density X0d)

Case 1 (W = [1 0]): γ_min = 151.578   μ_γ = 254.15   σ_γ = 31.559   u_max = 30036   μ_u = 11165   σ_u = 7067.5
Case 2 (W = [1 1]): γ_min = 128.115   μ_γ = 361.94   σ_γ = 75.626   u_max = 34929   μ_u = 12711   σ_u = 8190.3
Case 3 (W = [0 1]): γ_min = 0.00120   μ_γ = 20.715   σ_γ = 3.6789   u_max = 1090.8   μ_u = 447.79   σ_u = 268.42
Table 3-16: Performance of NGAC on Linearized System (3.4) (High-Density X0d)

For Weight Cases 1 and 2, Figure 3-40 and Figure 3-45 display closed V contours, showing stable trajectories and well-conditioned γ's (Figure 3-41 and Figure 3-46). Figure 3-51 still exhibits the negative γ seen in Case #2; however, this is expected, because the GA search ignores γ altogether. Hence, it is advisable to put some small weight on γ when searching for controllers that yield low u_max. Comparing the state and control responses of the nonlinear LARC in Figure 3-8 and Figure 3-9 to those of the nonlinear GAC with weight settings W = [1 0] and W = [1 1] in Figure 3-43, Figure 3-44, Figure 3-48, and Figure 3-49, we see that the nonlinear GAC converges about four times faster with about 20% less maximum control effort.
Figure 3-40: Level Sets of V for NGAC on System (3.3) (W = [1 0], High-Density X0d)
Figure 3-41: "Gamma" (γ) for NGAC on System (3.3) (W = [1 0], High-Density X0d)
Figure 3-42: Critical Points for NGAC Using System (3.3) (W = [1 0], High-Density X0d)
Figure 3-43: States of System (3.3) Using NGAC (W = [1 0], High-Density X0d)
Figure 3-44: Control Signal of NGAC on System (3.3) (W = [1 0], High-Density X0d)
Figure 3-45: Level Sets of V for NGAC on System (3.3) (W = [1 1], High-Density X0d)
Figure 3-46: "Gamma" (γ) for NGAC on System (3.3) (W = [1 1], High-Density X0d)
Figure 3-47: Critical Points for NGAC Using System (3.3) (W = [1 1], High-Density X0d)
Figure 3-48: States of System (3.3) Using NGAC (W = [1 1], High-Density X0d)
Figure 3-49: Control Signal of NGAC on System (3.3) (W = [1 1], High-Density X0d)
Figure 3-50: Level Sets of V for NGAC on System (3.3) (W = [0 1], High-Density X0d)
Figure 3-51: "Gamma" (γ) for NGAC on System (3.3) (W = [0 1], High-Density X0d)
Figure 3-52: Critical Points for NGAC Using System (3.3) (W = [0 1], High-Density X0d)
Figure 3-53: States of System (3.3) Using NGAC (W = [0 1], High-Density X0d)
Figure 3-54: Control Signal of NGAC on System (3.3) (W = [0 1], High-Density X0d)

In conclusion, we have seen in this example that while the LARC, by design, has a larger region of stability, the GAC can yield better performance in terms of minimum rate of convergence and maximum control effort. The GAC also offers the ability to select the region of state space over which to perform the optimization. To answer Question 3.1: checking the performance of the nonlinear system directly on X0d using the CLF V = xᵀPx does offer a clear advantage over the robust-linear-control-theory-based procedure of Ngamsom's LARC.
It is important, however, to set both elements of W to a positive number and to use a sufficient number of elements in X0d (found via experimentation) to yield closed level sets for the CLF (λ_min(P)/λ_max(P) ≫ 0).

3.3 Example 2: Double Inverted Pendulum System

The system in Figure 3-55 was taken originally from Misawa (1995), where it is described in detail.

Figure 3-55: Double Inverted Pendulum System (Misawa et al., 1995)

The equations of motion are

ẋ = f(x) + g(x)u  (3.5)

where x ∈ ℝ⁴, f, g: ℝ⁴ → ℝ⁴, and u ∈ ℝ. Here f₁ = x₃ and f₂ = x₄, while f₃ and f₄ are lengthy trigonometric rational functions of the link angles x₁, x₂ and rates x₃, x₄ (with coefficients as given in Misawa et al., 1995), sharing the common denominator 5.9809 - cos²(x₁ - x₂); the input vector is

g = [ 0 ; 0 ; 565.1008/(5.9809 - cos²(x₁ - x₂)) ; -504.2688 cos(x₁ - x₂)/(5.9809 - cos²(x₁ - x₂)) ]

The linearized dynamics about the upright position are ẋ = Ax + Bu, with

A = [0 0 1 0; 0 0 0 1; 43.0260 -9.6925 -0.2541 0.1202; 0 13.3356 0.4787 -0.3593]  (3.6)
B = [0; 0; 113.4536; -101.2405]

3.3.1 LARC Results

Two ranges of the checking set X0d were used for all controllers in this example. The small range, xi ∈ [-0.1, 0.1], i = 1, 2, 3, 4, is used to capture the local, slightly nonlinear behavior about the origin. The large range, xi ∈ [-1, 1], i = 1, 2, 3, 4, is used to capture the highly nonlinear behavior away from the origin. Table 3-17 displays the LARC results for the double inverted pendulum system as reported in Ngamsom (2001).
P = [0.3882 0.9764 0.1241 0.1412; 0.9764 9.2923 1.0084 1.2328; 0.1241 1.0084 0.1211 0.1429; 0.1412 1.2328 0.1429 0.1759]
K = [0.1036 5.2008 0.3618 0.8021]
Q = diag(2, 2, 2, 2)
Table 3-17: LLARC Parameters for System (3.5) (Ngamsom, 2001)

Table 3-18 lists the randomly sampled performance of the linear LARC on the double inverted pendulum system (3.5). For the small range, γ_min is negative (although Ngamsom reports the system to be stable). It turns out that the system is unstable with respect to the particular quadratic CLF dictated by P in Table 3-17, and not necessarily quadratically unstable. For the large range, Ngamsom (2001) reports the system to be unstable. The γ_min value found here is highly negative, supporting Ngamsom's report.

Small range (xi ∈ [-0.1, 0.1]): γ_min = -2.6472   μ_γ = 0.0127   σ_γ = 1.5402   u_max = 0.6262   μ_u = 0.2628   σ_u = 0.1543
Large range (xi ∈ [-1, 1]): γ_min = -166.31   μ_γ = 30.482   σ_γ = 36.980   u_max = 6.4308   μ_u = 2.6432   σ_u = 1.5451
Table 3-18: Performance of LLARC on System (3.5)

Table 3-19 shows the randomly sampled performance of the nonlinear LARC on the double inverted pendulum system (3.5). Because the linear LARC performs poorly on the large range, we may conclude that the system nonlinearities cause the nonlinear controller to use high control effort to keep the state trajectory within level sets whose "angle" and "skewing" (if one can visualize in 4D!) are optimized for the locally linear behavior of the system.

Small range (xi ∈ [-0.1, 0.1]): γ_min = 0.4378   μ_γ = 10.837   σ_γ = 4.3018   u_max = 1.4729   μ_u = 0.5487   σ_u = 0.3318
Large range (xi ∈ [-1, 1]): γ_min = 0.0914   μ_γ = 11.313   σ_γ = 4.5151   u_max = 17982   μ_u = 12741   σ_u = 174.81
Table 3-19: Performance of NLARC on System (3.5)

Table 3-20 lists the randomly sampled performance of the linear LARC on the linearized double inverted pendulum system (3.6).
The rate of convergence is negative for the linear LARC on the nonlinear dynamics and relatively slow for the linear LARC on the linear dynamics, which suggests that significant nonlinearities will destabilize the system.

Small range (xi ∈ [-0.1, 0.1]): γ_min = 0.0208   μ_γ = 0.4217   σ_γ = 1.4648   u_max = 0.6187   μ_u = 0.2634   σ_u = 0.1546
Large range (xi ∈ [-1, 1]): γ_min = 0.0207   μ_γ = 0.4017   σ_γ = 1.2807   u_max = 6.2708   μ_u = 2.6458   σ_u = 1.5476
Table 3-20: Performance of LLARC on Linearized System (3.6)

Figure 3-56 and Figure 3-57 contain the system signal responses to the initial condition (0.5, 0, 0, 0) for the linear LARC on the double inverted pendulum system (3.5). Figure 3-58 and Figure 3-59 contain the corresponding responses for the nonlinear LARC. The nonlinear LARC response is notable for the chattering of the control signal (Figure 3-59) and, hence, of the state variables. Such is the nature of the nonlinear controller combined with a finite-time-stepping numerical simulation method. The infamous denominator term of the Sontag-like control law, xᵀPg, gets very close to zero and rapidly switches sign during the first second of the simulation. With progressively smaller time steps the switching is less severe; however, the time step required becomes impractically small for producing a reasonably long simulation. The problem reflects an actual physical implementation issue for the control law: discrete-time sampling. A digital implementation of the control law will have the same problem, most likely even worse, given the limitations of sampling rates in real-time digital control systems.
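The blow-up of the 1/b term near the singular set b = xᵀPg = 0 can be seen directly, along with one common mitigation: replacing 1/b by b/(b² + ε), which bounds the control near b = 0 and approaches the original law for |b| ≫ √ε. This smoothing (akin to boundary-layer smoothing in sliding-mode control) is not the thesis's remedy, and ε is a hypothetical tuning parameter.

```python
import numpy as np

def sontag_u(a, b, q):
    # Pointwise Sontag-type control; chatters when b = x'Pg crosses zero.
    return 0.0 if b == 0 else -(a + np.sqrt(a * a + q * b * b)) / b

def sontag_u_smoothed(a, b, q, eps=1e-3):
    # Replace 1/b by b/(b*b + eps): bounded near the singular set b ~ 0,
    # approaching the original law for |b| >> sqrt(eps).
    return -(a + np.sqrt(a * a + q * b * b)) * b / (b * b + eps)

# Away from b = 0 the two laws agree closely...
print(sontag_u(1.0, 2.0, 3.0), sontag_u_smoothed(1.0, 2.0, 3.0))
# ...while near b = 0 the smoothed law stays bounded instead of blowing up.
for b in (1e-2, 1e-4, 1e-6):
    print(f"b={b:.0e}:  raw={sontag_u(1.0, b, 3.0):.3e}  smoothed={sontag_u_smoothed(1.0, b, 3.0):.3e}")
```

The price of the smoothing is that V̇ < 0 is no longer guaranteed in the thin layer |b| ≲ √ε, so it trades the chattering problem for a small-region stability question.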
Figure 3-56: States of System (3.5) Using LLARC
Figure 3-57: Control Signal of System (3.5) Using LLARC
Figure 3-58: States of System (3.5) Using NLARC
Figure 3-59: Control Signal of System (3.5) Using NLARC

3.3.2 Linear GAC with Small Checking Set Range

Table 3-21 lists the genetic algorithm parameters used to optimize the linear GAC for the double inverted pendulum system (3.5) using a small X_0^d range. The checking set range is set low to minimize the effects of the system nonlinearities. The control gain upper limit, K_max, is set to 10.

System: Double Inverted Pendulum (3.5)
Controller Type: linear
Generations: 50
Population Size: 50
Checking Points: 200
Critical Points: 50
ΔE: 10
K_max: 10
c: 0
X_0 Range: x_i ∈ [-0.1, 0.1], i = 1,2,3,4

Table 3-21: GA Parameters to Optimize LGAC for System (3.5) (Small X_0^d Range)

Table 3-22 shows the genetic algorithm results of the optimization of the linear GAC for the double inverted pendulum system (3.5) using a small X_0^d range. It is interesting that the optimal Q has a structure in which the first diagonal element is dominant and all other elements are nearly negligible except the terms involving the state x_1. It is also interesting that some terms are negative, which suggests that an optimal controller will keep the states coupled through these terms opposite in sign during most of the state trajectory. This type of solution is not obvious and would probably not have been selected by a human designer, yet it becomes a useful insight for the controller design.
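A Q with one dominant diagonal entry and near-zero everything else sits close to singular, which is what later motivates the c > 0 option in the GA parameter tables. A minimal sketch of that safeguard follows, using illustrative numbers (the exact optimized entries, including their signs, are not reproduced here) and a function name of my choosing.

```python
import numpy as np

def floor_spectrum(Q, c):
    """Clip the eigenvalues of a symmetric matrix at c.

    One way to realize the c > 0 safeguard: eigenvalues of Q below c
    are raised to c, so Q (and hence the Lyapunov solution P built
    from it) keeps a positive-definiteness margin.
    """
    w, V = np.linalg.eigh(0.5 * (Q + Q.T))   # symmetrize, then decompose
    return V @ np.diag(np.maximum(w, c)) @ V.T

# Illustrative Q in the spirit of Table 3-22: one dominant entry,
# the rest nearly zero, so the smallest eigenvalue is nearly zero.
Q = np.diag([0.4513, 1e-4, 1e-4, 2e-4])
Qc = floor_spectrum(Q, 0.01)
```

After flooring, the smallest eigenvalue is at least 0.01 while the dominant eigenvalue is untouched.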
Weight Case 1, W = [1 0]:
P = [0.3010 0.6116 0.0781 0.0914; 0.6116 2.0038 0.2208 0.2984; 0.0781 0.2208 0.0258 0.0330; 0.0914 0.2984 0.0330 0.0445]
K = [0.3922 5.1569 0.4182 0.7570]
Q = [0.4513 0.0053 0.0021 0.0041; 0.0053 0.0001 0.0000 0.0000; 0.0021 0.0000 0.0001 0.0001; 0.0041 0.0000 0.0001 0.0002]

Weight Case 2, W = [1 1]:
P = [0.1179 0.2750 0.0367 0.0418; 0.2750 1.2195 0.1267 0.1819; 0.0367 0.1267 0.0144 0.0190; 0.0418 0.1819 0.0190 0.0272]
K = [0.0676 4.0455 0.2904 0.5908]
Q = [0.0559 0.0019 0.0010 0.0004; 0.0019 0.0001 0.0000 0.0001; 0.0010 0.0000 0.0000 0.0000; 0.0004 0.0001 0.0000 0.0001]

Weight Case 3, W = [0 1]:
P = [0.1300 0.2955 0.0392 0.0448; 0.2955 1.2698 0.1327 0.1895; 0.0392 0.1327 0.0151 0.0199; 0.0448 0.1895 0.0199 0.0283]
K = [0.0901 4.1270 0.2997 0.6029]
Q = [0.0765 0.0021 0.0007 0.0010; 0.0021 0.0001 0.0000 0.0000; 0.0007 0.0000 0.0000 0.0000; 0.0010 0.0000 0.0000 0.0000]

Table 3-22: Optimized Parameters of LGAC for System (3.5) (Small X_0^d Range)

Table 3-23 displays the randomly sampled performance of the linear GAC on the double inverted pendulum system (3.5) using a small X_0^d range. Unlike the linear LARC (Table 3-18), the linear GAC yields a positive minimum rate of convergence with a much higher average rate of convergence, and without a significant difference in control effort or gains across the sets of fitness function weights.

Weight Case | Weights | γ_min  | μ_γ     | σ_γ    | u_max  | μ_u    | σ_u
1           | W=[1 0] | 0.2212 | 12.6432 | 6.5883 | 0.6455 | 0.2597 | 0.1526
2           | W=[1 1] | 0.0568 | 12.6826 | 6.5130 | 0.4851 | 0.2049 | 0.1192
3           | W=[0 1] | 0.0739 | 12.5958 | 6.4883 | 0.5028 | 0.2090 | 0.1229

Table 3-23: Performance of LGAC on System (3.5) (Small X_0^d Range)

The randomly sampled performance of the linear GAC on the linearized double inverted pendulum system (3.6) using a small X_0^d range is listed next. The rate of convergence is fairly close to that of Table 3-23, which suggests that the nonlinearities do not play as significant a role locally as they do with the linear LARC, where the difference in γ_min between the nonlinear and linear dynamics is substantial.
Weight Case | Weights | γ_min  | μ_γ     | σ_γ    | u_max  | μ_u    | σ_u
1           | W=[1 0] | 0.1113 | 13.1476 | 6.9741 | 0.6540 | 0.2605 | 0.1539
2           | W=[1 1] | 0.0799 | 13.0182 | 6.8206 | 0.4877 | 0.2048 | 0.1198
3           | W=[0 1] | 0.0168 | 13.0161 | 6.8388 | 0.5043 | 0.2068 | 0.1218

Table 3-23: Performance of LGAC on Linearized System (3.6) (Small X_0^d Range)

Figure 3-60 through Figure 3-65 contain the system signal responses to the initial condition (0.5, 0, 0, 0) for the linear GAC on the double inverted pendulum system (3.5) using a small X_0^d range. The weight setting W=[1 0] yields a much faster response, but at the cost of more control effort than the linear LARC. The weight setting W=[1 1] does not yield a noticeably different response speed than the linear LARC, but the control effort is higher. The weight setting W=[0 1] yields a much faster response with only slightly more control effort than the linear LARC. The linear GAC for W=[0 1] would probably be selected over the linear LARC.

Figure 3-60: States of System (3.5) Using LGAC (W=[1 0], Small X_0^d Range)
Figure 3-61: Control Signal of System (3.5) Using LGAC (W=[1 0], Small X_0^d Range)
Figure 3-62: States of System (3.5) Using LGAC (W=[1 1], Small X_0^d Range)
Figure 3-63: Control Signal of System (3.5) Using LGAC (W=[1 1], Small X_0^d Range)
Figure 3-64: States of System (3.5) Using LGAC (W=[0 1], Small X_0^d Range)
Figure 3-65: Control Signal of System (3.5) Using LGAC (W=[0 1], Small X_0^d Range)

3.3.3 Linear GAC with Large Checking Set Range

Table 3-24 shows the genetic algorithm parameters used to optimize the linear GAC for the double inverted pendulum system (3.5) using a large X_0^d range. The checking set range is set high to demonstrate the effects of the system nonlinearities. The control gain upper limit, K_max, is set to 10.
System: Double Inverted Pendulum (3.5)
Controller Type: linear
Generations: 50
Population Size: 50
Checking Points: 200
Critical Points: 50
ΔE: 10
K_max: 10
c: 0
X_0 Range: x_i ∈ [-1, 1], i = 1,2,3,4

Table 3-24: GA Parameters to Optimize LGAC for System (3.5) (Large X_0^d Range)

Table 3-25 displays the GA results of the optimization of the linear GAC for the double inverted pendulum system using a large X_0^d range. The results are somewhat similar to those of the linear GAC run with a small X_0^d range, indicating low sensitivity to the X_0^d range.

Weight Case 1, W = [1 0]:
P = [0.1009 0.2370 0.0322 0.0363; 0.2370 1.1231 0.1154 0.1678; 0.0322 0.1154 0.0131 0.0174; 0.0363 0.1678 0.0174 0.0251]
K = [0.0237 3.8883 0.2727 0.5673]
Q = [0.0186 0.0019 0.0002 0.0008; 0.0019 0.0002 0.0000 0.0000; 0.0002 0.0000 0.0000 0.0001; 0.0008 0.0000 0.0001 0.0002]

Weight Case 2, W = [1 1]:
P = [0.1724 0.3756 0.0496 0.0574; 0.3756 1.4711 0.1574 0.2202; 0.0496 0.1574 0.0182 0.0237; 0.0574 0.2202 0.0237 0.0330]
K = [0.1804 4.4422 0.3357 0.6500]
Q = [0.1694 0.0010 0.0005 0.0025; 0.0010 0.0000 0.0000 0.0000; 0.0005 0.0000 0.0000 0.0000; 0.0025 0.0000 0.0000 0.0001]

Weight Case 3, W = [0 1]:
P = [0.1581 0.3498 0.0458 0.0528; 0.3498 1.4030 0.1484 0.2091; 0.0458 0.1484 0.0170 0.0223; 0.0528 0.2091 0.0223 0.0312]
K = [0.1486 4.3311 0.3230 0.6336]
Q = [0.1348 0.0014 0.0011 0.0017; 0.0014 0.0001 0.0000 0.0001; 0.0011 0.0000 0.0000 0.0000; 0.0017 0.0001 0.0000 0.0002]

Table 3-25: Optimized Parameters of LGAC for System (3.5) (Large X_0^d Range)

Table 3-26 lists the randomly sampled performance of the linear GAC on the double inverted pendulum system (3.5) using a large X_0^d range. Weight case 1 yielded a faster rate of convergence than the LARC, but a larger maximum control effort.
Weight Case | Weights | γ_min   | μ_γ    | σ_γ    | u_max  | μ_u    | σ_u
1           | W=[1 0] | 0.1073  | 9.7520 | 3.4343 | 45061  | 11.004 | 332.63
2           | W=[1 1] | -158.19 | 17.641 | 35.302 | 5.4339 | 2.2438 | 1.3166
3           | W=[0 1] | -148.99 | 16.986 | 34.249 | 5.2611 | 2.1870 | 1.2875

Table 3-26: Performance of LGAC on System (3.5) (Large X_0^d Range)

Table 3-27 displays the randomly sampled performance of the linear GAC on the linearized double inverted pendulum system (3.6) using a large X_0^d range. Notice that γ_min is negative for W=[0 1]. This is because the minimum eigenvalue of P is slightly negative (λ_min(P) = -1.4331×10^-5) due to Q having an eigenvalue close to zero. The linear system is stable, with the closed-loop poles for the linearized dynamics all in the left half plane (closed-loop poles = {-7.2235 ± 2.0095i, -7.7300, -5.9369}); however, it is unstable with respect to the CLF. Though this is to be expected when the rate of convergence is not checked, the problem is solved by setting c > 0 in Table 3-24 such that the smallest eigenvalue of Q is equal to c.

Weight Case | Weights | γ_min   | μ_γ    | σ_γ    | u_max  | μ_u    | σ_u
1           | W=[1 0] | 0.0604  | 13.024 | 6.7999 | 4.6734 | 1.9640 | 1.1540
2           | W=[1 1] | 0.0235  | 13.001 | 6.7290 | 5.3853 | 2.2410 | 1.3147
3           | W=[0 1] | -0.7703 | 12.994 | 6.8051 | 5.2696 | 2.1968 | 1.2867

Table 3-27: Performance of LGAC on Linearized System (3.6) (Large X_0^d Range)

Figure 3-66 through Figure 3-71 contain the system signal responses to the initial condition (0.5, 0, 0, 0) for the linear GAC on the double inverted pendulum system (3.5) using a large X_0^d range. The controller gains of this set of controllers are similar; hence the dynamic responses are similar, none of them being an improvement over the linear LARC. It is suspected that the number of checking points used is insufficient for the large X_0^d range. The same number of points was used in this case as with the smaller X_0^d range, meaning the estimates of γ_min and u_max were better for the small X_0^d range, making those controllers better. The large range is 10 times larger than the small range.
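The growth implied by that factor-of-10 range increase is geometric in the state dimension, and can be made concrete in a couple of lines (the function name is mine):

```python
def equivalent_checking_points(n_points, range_ratio, dim):
    """Checking-set size needed to preserve point density when the
    per-axis sampling range grows by `range_ratio` in `dim` dimensions:
    the box volume, and hence the point count, scales as range_ratio**dim.
    """
    return n_points * range_ratio ** dim

# 200 points at the small-range density, scaled to the 10x wider
# range of the 4th-order pendulum:
print(equivalent_checking_points(200, 10, 4))  # 2000000, a factor of 10**4
```

This is the curse of dimensionality applied to the checking set: the required count is multiplied by (range ratio)^n for an nth-order system.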
Given a 4th-order system, to duplicate the checking set density of the small range for the large range, the large-range checking set size would have to be increased by a factor of 10,000! We now see the major drawback of direct pointwise checking of CLF performance. The random search for critical points relaxes the checking set size requirements; however, the sufficient number of checking points still becomes prohibitively large for higher-dimensional systems. Future research should focus on reducing the rate at which the required checking set size scales with the system dimension.

Figure 3-66: States of System (3.5) Using LGAC (W=[1 0], Large X_0^d Range)
Figure 3-67: Control Signal of System (3.5) Using LGAC (W=[1 0], Large X_0^d Range)
Figure 3-68: States of System (3.5) Using LGAC (W=[1 1], Large X_0^d Range)
Figure 3-69: Control Signal of System (3.5) Using LGAC (W=[1 1], Large X_0^d Range)
Figure 3-70: States of System (3.5) Using LGAC (W=[0 1], Large X_0^d Range)
Figure 3-71: Control Signal of System (3.5) Using LGAC (W=[0 1], Large X_0^d Range)

3.3.4 Nonlinear GAC with Small Checking Set Range

Table 3-28 shows the genetic algorithm parameters used to optimize the nonlinear GAC for the double inverted pendulum system (3.5) using a small X_0^d range. The checking set range is set small to minimize the effects of the system nonlinearities. The control gain upper limit, K_max, is set to 10.

System: Double Inverted Pendulum (3.5)
Controller Type: nonlinear
Generations: 50
Population Size: 50
Checking Points: 200
Critical Points: 50
ΔE: 10
K_max: 10
c: 0
X_0 Range: x_i ∈ [-0.1, 0.1], i = 1,2,3,4

Table 3-28: GA Parameters to Optimize NGAC for System (3.5) (Small X_0^d Range)

Table 3-29 displays the GA results of the optimization of the nonlinear GAC for the double inverted pendulum system (3.5) using a small X_0^d range. The results are similar to those of the linear GAC run.
Weight Case 1, W = [1 0]:
P = [0.1536 0.3442 0.0453 0.0522; 0.3442 1.3934 0.1474 0.2079; 0.0453 0.1474 0.0169 0.0222; 0.0522 0.2079 0.0222 0.0311]
K = [0.1442 4.3183 0.3216 0.6321]
Q = [0.1301 0.0019 0.0011 0.0000; 0.0019 0.0000 0.0000 0.0000; 0.0011 0.0000 0.0001 0.0002; 0.0000 0.0000 0.0002 0.0006]

Weight Case 2, W = [1 1]:
P = [0.1524 0.3407 0.0449 0.0518; 0.3407 1.3835 0.1465 0.2067; 0.0449 0.1465 0.0168 0.0220; 0.0518 0.2067 0.0220 0.0309]
K = [0.1420 4.3067 0.3202 0.6298]
Q = [0.1279 0.0021 0.0011 0.0007; 0.0021 0.0002 0.0000 0.0001; 0.0011 0.0000 0.0001 0.0001; 0.0007 0.0001 0.0001 0.0001]

Weight Case 3, W = [0 1]:
P = [0.1665 0.3652 0.0480 0.0554; 0.3652 1.4418 0.1535 0.2154; 0.0480 0.1535 0.0177 0.0231; 0.0554 0.2154 0.0231 0.0322]
K = [0.1666 4.3942 0.3302 0.6428]
Q = [0.1541 0.0022 0.0005 0.0002; 0.0022 0.0001 0.0000 0.0000; 0.0005 0.0000 0.0000 0.0000; 0.0002 0.0000 0.0000 0.0000]

Table 3-29: Optimized Parameters of NGAC for System (3.5) (Small X_0^d Range)

Table 3-30 lists the randomly sampled performance of the nonlinear GAC on the double inverted pendulum system (3.5) using a small X_0^d range. The nonlinear LARC's minimum rate of convergence is about twice as fast as the nonlinear GAC's, but the nonlinear LARC's maximum control effort is over twice that of the nonlinear GAC.

Weight Case | Weights | γ_min  | μ_γ     | σ_γ    | u_max  | μ_u    | σ_u
1           | W=[1 0] | 0.2230 | 12.9758 | 6.7768 | 0.5439 | 0.2201 | 0.1311
2           | W=[1 1] | 0.0416 | 12.9448 | 6.7075 | 0.5349 | 0.2216 | 0.1311
3           | W=[0 1] | 0.1084 | 12.9666 | 6.7272 | 0.5400 | 0.2256 | 0.1338

Table 3-30: Performance of NGAC on System (3.5) (Small X_0^d Range)

Table 3-31 tabulates the randomly sampled performance of the nonlinear GAC on the linearized double inverted pendulum system (3.6) using a small X_0^d range. As with the linear GAC, the linearized dynamics are close to the nonlinear dynamics locally, decreasing the effect of the system nonlinearities.
Weight Case | Weights | γ_min  | μ_γ     | σ_γ    | u_max  | μ_u    | σ_u
1           | W=[1 0] | 0.2890 | 13.0663 | 6.9025 | 0.5251 | 0.2170 | 0.1291
2           | W=[1 1] | 0.1984 | 12.9550 | 6.7552 | 0.5317 | 0.2176 | 0.1283
3           | W=[0 1] | 0.0433 | 12.9104 | 6.8273 | 0.5330 | 0.2215 | 0.1303

Table 3-31: Performance of NGAC on Linearized System (3.6) (Small X_0^d Range)

Figure 3-72 through Figure 3-77 contain the system signal responses to the initial condition (0.5, 0, 0, 0) for the nonlinear GAC on the double inverted pendulum system (3.5) using a small X_0^d range. As with the nonlinear LARC, the nonlinear GAC for the first two weight settings yields a choppy control response that in turn makes the state response choppy. The weight setting W=[0 1] is ideal and is a major improvement over the nonlinear LARC.

Figure 3-72: States of System (3.5) Using NGAC (W=[1 0], Small X_0^d Range)
Figure 3-73: Control Signal of System (3.5) Using NGAC (W=[1 0], Small X_0^d Range)
Figure 3-74: States of System (3.5) Using NGAC (W=[1 1], Small X_0^d Range)
Figure 3-75: Control Signal of System (3.5) Using NGAC (W=[1 1], Small X_0^d Range)
Figure 3-76: States of System (3.5) Using NGAC (W=[0 1], Small X_0^d Range)
Figure 3-77: Control Signal of System (3.5) Using NGAC (W=[0 1], Small X_0^d Range)

3.3.5 Nonlinear GAC with Large Checking Set Range

Table 3-32 shows the genetic algorithm parameters used to optimize the nonlinear GAC for the double inverted pendulum system (3.5) using a large X_0^d range. The checking set range is set high to demonstrate the effects of the system nonlinearities. The control gain upper limit, K_max, is set to 10.
System: Double Inverted Pendulum (3.5)
Controller Type: nonlinear
Generations: 50
Population Size: 50
Checking Points: 200
Critical Points: 50
ΔE: 10
K_max: 10
c: 0
X_0 Range: x_i ∈ [-1, 1], i = 1,2,3,4

Table 3-32: GA Parameters to Optimize NGAC for System (3.5) (Large X_0^d Range)

Table 3-33 displays the GA results of the optimization of the nonlinear GAC for the double inverted pendulum system (3.5) using a large X_0^d range.

Weight Case 1, W = [1 0]:
P = [0.1009 0.2370 0.0322 0.0363; 0.2370 1.1231 0.1154 0.1678; 0.0322 0.1154 0.0131 0.0174; 0.0363 0.1678 0.0174 0.0251]
K = [0.0237 3.8883 0.2727 0.5673]
Q = [0.0186 0.0019 0.0002 0.0008; 0.0019 0.0002 0.0000 0.0000; 0.0002 0.0000 0.0000 0.0001; 0.0008 0.0000 0.0001 0.0002]

Weight Case 2, W = [1 1]:
P = [0.2060 0.4427 0.0577 0.0670; 0.4427 1.6272 0.1759 0.2432; 0.0577 0.1759 0.0204 0.0264; 0.0670 0.2432 0.0264 0.0364]
K = [0.2423 4.6634 0.3609 0.6830]
Q = [0.2426 0.0081 0.0009 0.0009; 0.0081 0.0003 0.0000 0.0000; 0.0009 0.0000 0.0000 0.0000; 0.0009 0.0000 0.0000 0.0000]

Weight Case 3, W = [0 1]:
P = [0.1272 0.2925 0.0390 0.0445; 0.2925 1.3032 0.1371 0.1950; 0.0390 0.1371 0.0158 0.0207; 0.0445 0.1950 0.0207 0.0292]
K = [0.0807 4.1851 0.3027 0.6116]
Q = [0.0677 0.0005 0.0006 0.0001; 0.0005 0.0001 0.0002 0.0000; 0.0006 0.0002 0.0015 0.0003; 0.0001 0.0000 0.0003 0.0001]

Table 3-33: Optimized Parameters of NGAC for System (3.5) (Large X_0^d Range)

Table 3-34 lists the randomly sampled performance of the nonlinear GAC on the double inverted pendulum system (3.5) using a large X_0^d range.

Weight Case | Weights | γ_min  | μ_γ    | σ_γ    | u_max  | μ_u    | σ_u
1           | W=[1 0] | 0.1073 | 9.7520 | 3.4343 | 45061  | 11.004 | 332.63
2           | W=[1 1] | 0.1202 | 10.819 | 3.6796 | 9902.3 | 9.0307 | 107.42
3           | W=[0 1] | 0.0798 | 9.9874 | 3.2041 | 9128.9 | 8.2549 | 92.792

Table 3-34: Performance of NGAC on System (3.5) (Large X_0^d Range)

Table 3-35 tabulates the randomly sampled performance of the nonlinear GAC on the linearized double inverted pendulum system (3.6) using a large X_0^d range.
In theory, the nonlinear GAC should yield a positive γ_min for the linearized double inverted pendulum system, since the nonlinear GAC becomes the LQR for the Q in Table 3-33. However, weight case 3 yielded a P with an eigenvalue of 0.00001313, due to Q having eigenvalues so close to zero, which caused γ_min to take a small negative value. Setting c > 0 forces the eigenvalues of Q to be no less than c and alleviates the problem.

Weight Case | Weights | γ_min   | μ_γ     | σ_γ    | u_max  | μ_u    | σ_u
1           | W=[1 0] | 0.1505  | 12.9276 | 6.6813 | 5.0258 | 2.1126 | 1.2399
2           | W=[1 1] | 0.0933  | 13.1822 | 6.9588 | 5.9003 | 2.3706 | 1.3879
3           | W=[0 1] | -0.0045 | 12.9970 | 6.8068 | 4.6618 | 1.9632 | 1.1494

Table 3-35: Performance of NGAC on Linearized System (3.6) (Large X_0^d Range)

Figure 3-78 through Figure 3-83 contain the system signal responses to the initial condition (0.5, 0, 0, 0) for the nonlinear GAC on the double inverted pendulum system (3.5) using a large X_0^d range. All three sets of weights yield controller chattering. The cause is attri