For this derivation, we employ two methods. The direct method, similar to that of Hartberger (1973), proceeds by directly integrating (C.8). The indirect method avoids this integration by an instructive trick.
Direct Method. Integrating (C.8), we get

δx(T) = δx(τ) + ∫_τ^T f_x[x*(t), u*(t), t] δx(t) dt, (C.9)

where the initial condition δx(τ) is given in (C.5).
Since δx(T) is the change in the terminal state from the optimal state x*(T), the resulting change δJ in the objective function must be nonpositive. Thus,
δJ = c δx(T) = c δx(τ) + ∫_τ^T c f_x[x*(t), u*(t), t] δx(t) dt ≤ 0. (C.10)

Furthermore, since (C.8) is a linear homogeneous differential equation, we can write its general solution as
δx(t) = Φ(t, τ) δx(τ), (C.11)

where the fundamental solution matrix, or transition matrix, Φ(t, τ) ∈ E^{n×n} obeys

dΦ(t, τ)/dt = f_x[x*(t), u*(t), t] Φ(t, τ), Φ(τ, τ) = I, (C.12)

where I is the n×n identity matrix; see Appendix A.
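For intuition, here is a minimal numerical sketch of (C.11) and (C.12) in the scalar case n = 1. The data are hypothetical, not from the text: f_x along the optimal path is taken to be a constant a, so that the exact transition "matrix" is Φ(t, τ) = e^{a(t−τ)}.

```python
# Numerical sketch of (C.11)-(C.12) in a hypothetical scalar case (n = 1):
# f_x along the optimal path is assumed constant, a = 0.5, so the exact
# transition "matrix" is Phi(t, tau) = exp(a*(t - tau)).
import math

a = 0.5                     # stands in for f_x[x*(t), u*(t), t]
tau, T = 0.0, 2.0
steps = 100_000
dt = (T - tau) / steps

# Integrate (C.12): dPhi/dt = f_x * Phi, with Phi(tau, tau) = 1.
phi = 1.0
for _ in range(steps):
    phi += a * phi * dt

# By (C.11), a perturbation at time tau propagates linearly to time T.
delta_x_tau = 0.1
delta_x_T = phi * delta_x_tau
```

With a fine enough step, the Euler-integrated phi agrees with exp(a(T − τ)), illustrating why the variational equation's general solution can be written through Φ alone.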
Substituting for δx(t) from (C.11) into (C.10), we have

δJ = c δx(τ) + ∫_τ^T c f_x[x*(t), u*(t), t] Φ(t, τ) δx(τ) dt ≤ 0. (C.13)
C.2. Derivation of Adjoint Equation and the Maximum Principle

This induces the definition

λ*(τ) = ∫_τ^T c f_x[x*(t), u*(t), t] Φ(t, τ) dt + c, (C.14)

which, when substituted into (C.13), yields

δJ = λ*(τ) δx(τ) ≤ 0. (C.15)

But δx(τ) is supplied in (C.5). Noting that ε > 0, we can rewrite (C.15) as
λ*(τ) f[x*(τ), v, τ] − λ*(τ) f[x*(τ), u*(τ), τ] ≤ 0. (C.16)

Defining the Hamiltonian for the Mayer form as

H[x, u, λ, t] = λ f(x, u, t), (C.17)

we can rewrite (C.16) as

H[x*(τ), u*(τ), λ(τ), τ] ≥ H[x*(τ), v, λ(τ), τ]. (C.18)

Since this can be done for almost every τ, we have the required Hamiltonian maximizing condition.
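To see the maximizing condition (C.18) in action, consider a hypothetical example, not from the text: f(x, u, t) = ax + bu with the control restricted to [−1, 1]. Then H = λ(ax + bu) is linear in u, so the maximum over the interval is attained at an endpoint: u* = 1 if λb > 0 and u* = −1 if λb < 0 (a bang-bang control).

```python
# Hypothetical illustration of the Hamiltonian maximizing condition (C.18):
# f(x, u, t) = a*x + b*u with admissible controls u in [-1, 1].

def hamiltonian(x, u, lam, a=1.0, b=2.0):
    """Mayer-form Hamiltonian (C.17): H = lambda * f(x, u, t), with f = a*x + b*u."""
    return lam * (a * x + b * u)

def maximizing_u(x, lam, a=1.0, b=2.0):
    """Maximize H over u in [-1, 1]; H is linear in u, so an endpoint wins."""
    return max((-1.0, 1.0), key=lambda u: hamiltonian(x, u, lam, a, b))

x, lam = 0.3, -0.7
u_star = maximizing_u(x, lam)       # lam * b < 0 here, so u* = -1

# Check (C.18): H at u* dominates H at any other admissible v.
ok = all(hamiltonian(x, u_star, lam) >= hamiltonian(x, v, lam)
         for v in (-1.0, -0.5, 0.0, 0.5, 1.0))
```

The values a, b, x, and λ above are illustrative only; the point is that (C.18) compares the optimal control against every admissible v at the same state and adjoint values.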
The differential equation form of the adjoint equation (C.14) can be obtained by taking its derivative with respect to τ. Thus,
dλ(τ)/dτ = ∫_τ^T c f_x[x*(t), u*(t), t] (dΦ(t, τ)/dτ) dt − c f_x[x*(τ), u*(τ), τ]. (C.19)

It is also known that the transition matrix has the property

dΦ(t, τ)/dτ = −Φ(t, τ) f_x[x*(τ), u*(τ), τ],

which can be used in (C.19) to obtain
dλ(τ)/dτ = −∫_τ^T c f_x[x*(t), u*(t), t] Φ(t, τ) f_x[x*(τ), u*(τ), τ] dt − c f_x[x*(τ), u*(τ), τ]. (C.20)

Using the definition (C.14) of λ(τ) in (C.20), we have
dλ(τ)/dτ = −λ(τ) f_x[x*(τ), u*(τ), τ]

with λ(T) = c, or, using (C.17) and noting that τ is arbitrary, we have

λ̇ = −λ f_x[x*, u*, t] = −H_x[x*, u*, λ, t], λ(T) = c. (C.21)

This completes the derivation of the maximum principle along with the adjoint equation using the direct method.
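The consistency of the integral form (C.14) with the differential form (C.21) can be checked numerically. In a hypothetical scalar example with constant f_x = a (not from the text), Φ(t, τ) = e^{a(t−τ)} and (C.14) evaluates in closed form to λ(τ) = c e^{a(T−τ)}.

```python
# Numerical check that (C.14) satisfies the adjoint equation (C.21) in a
# hypothetical scalar example: f_x = a constant, Phi(t, tau) = exp(a*(t - tau)),
# so (C.14) reduces to lambda(tau) = c * exp(a*(T - tau)).
import math

a, c, T = 0.4, 1.5, 2.0

def lam(tau, n=20_000):
    """Evaluate (C.14): integral over [tau, T] of c*f_x*Phi(t, tau) dt, plus c."""
    if tau >= T:
        return c
    h = (T - tau) / n
    total = 0.0
    for k in range(n):
        t = tau + (k + 0.5) * h          # midpoint rule
        total += c * a * math.exp(a * (t - tau)) * h
    return total + c

tau = 0.5
closed_form = c * math.exp(a * (T - tau))    # what the integral evaluates to

# (C.21) asserts d(lambda)/d(tau) = -lambda * f_x, with lambda(T) = c;
# approximate the derivative by a central finite difference.
eps = 1e-4
dlam = (lam(tau + eps) - lam(tau - eps)) / (2 * eps)
```

The quadrature value of λ(τ) matches the closed form, λ(T) = c holds by construction, and the finite-difference derivative agrees with −λ(τ)a, mirroring the passage from (C.14) to (C.21).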
Indirect Method. The indirect method employs a trick which considerably simplifies the derivation. Instead of integrating (C.8) explicitly, we now suppose that this integration yields the change δx(T) in the state at the terminal time. As in (C.10), we have

δJ = c δx(T) ≤ 0. (C.22)
First, we define
λ(T) = c, (C.23)
which makes it possible to write (C.22) as
δJ = c δx(T) = λ(T) δx(T) ≤ 0. (C.24)

Note parenthetically that if the objective function is J = S[x(T)], we must define λ(T) = ∂S[x(T)]/∂x(T), giving us

δJ = (∂S[x(T)]/∂x(T)) δx(T) = λ(T) δx(T).
Now, λ(T) δx(T) is the change in the objective function due to a change δx(T) at the terminal time T. That is, λ(T) is the marginal return, or the marginal change in the objective function per unit change in the state at time T. But δx(T) cannot be known without integrating (C.8). We do know, however, the value of the change δx(τ) at time τ which caused the terminal change δx(T) via (C.8).
We would therefore like to pose the problem of obtaining the change δJ in the objective function in terms of the known value δx(τ); see Fel'dbaum (1965). Simply stated, we would like to obtain the marginal return λ(τ) per unit change in the state at time τ. Thus,

λ(τ) δx(τ) = δJ = λ(T) δx(T) ≤ 0. (C.25)

Obviously, knowing λ(τ) will make it possible to draw an inference about δJ, which is directly related to the needle-shaped variation applied in the small interval (τ − ε, τ].
However, since τ is arbitrary, our problem of finding λ(τ) can be translated into one of finding λ(t), t ∈ [0, T], such that

λ(t) δx(t) = λ(T) δx(T), t ∈ [0, T], (C.26)

or, in other words,

λ(t) δx(t) = constant, λ(T) = c. (C.27)

It turns out that the differential equation which λ(t) must satisfy can be easily found. From (C.27),
(d/dt)[λ(t) δx(t)] = λ (dδx/dt) + λ̇ δx = 0, (C.28)

which, after substituting for dδx/dt from (C.8), becomes

λ f_x δx + λ̇ δx = (λ f_x + λ̇) δx = 0. (C.29)

Since (C.29) is true for arbitrary δx, we have

λ̇ = −λ f_x = −H_x (C.30)
using the definition (C.17) for the Hamiltonian.
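The invariant (C.27) is easy to verify numerically: integrate the variational equation (C.8) and the adjoint equation (C.30) together and watch the product λ(t)δx(t). The scalar data below (f_x(t) = sin t) are hypothetical, chosen only for illustration.

```python
# Verify the invariant (C.27): along solutions of d(delta_x)/dt = f_x * delta_x
# (equation (C.8)) and d(lambda)/dt = -lambda * f_x (equation (C.30)), the
# product lambda(t) * delta_x(t) stays constant. Hypothetical scalar data.
import math

def f_x(t):
    """Stands in for f_x[x*(t), u*(t), t]; hypothetical time-varying example."""
    return math.sin(t)

T, n = 2.0, 100_000
dt = T / n
delta_x, lam = 0.3, 1.0          # delta_x(0) and lambda(0); any values work
products = [lam * delta_x]
t = 0.0
for _ in range(n):
    # forward Euler on the coupled pair (C.8) and (C.30)
    d_dx = f_x(t) * delta_x * dt
    d_lam = -lam * f_x(t) * dt
    delta_x += d_dx
    lam += d_lam
    t += dt
    products.append(lam * delta_x)

drift = max(products) - min(products)
```

Up to discretization error, the product never moves from its initial value, which is exactly the content of (C.26) and (C.27): λ(t) transports the terminal marginal return back to every earlier time.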
The Hamiltonian maximizing condition can be obtained by substituting for δx(τ) from (C.5) into (C.25). This is the same as what we did in (C.15) through (C.18).
The purpose of the alternative proof was to demonstrate the validity of the maximum principle for a simple problem without knowledge of any return function. For more complex problems, one needs complicated mathematical analysis to rigorously prove the maximum principle without making use of return functions. A part of mathematical rigor is in proving the existence of an optimal solution, without which necessary conditions are meaningless; see Young (1969).
Appendix D
Special Topics in Optimal Control
In this appendix we will discuss a number of specialized topics in seven sections. These are the Kalman and Kalman-Bucy filters, the Wiener process, Itô's Lemma, linear-quadratic problems, second-order variations, singular control, and Sethi-Skiba points. These topics are referred to but not discussed in the main body of the text. While we will not be able to go into great detail, we will provide an adequate description of these topics for our purposes. For further details, the reader can consult the references cited in the respective sections dealing with these topics.