Chapter K
Differential Calculus
In the second half of this book, starting from Chapter F, we have worked on developing a thorough understanding of function spaces, be it from a geometric or an analytic viewpoint. This work allows us to move in a variety of directions. In particular, we can now extend the classical differential calculus methods to the realm of maps defined on suitable function spaces, or more generally, on normed linear spaces. In turn, this “generalized” calculus can be used to develop a theory of optimization in which the choice objects need not be $n$-vectors, but members of an arbitrary normed linear space (as in calculus of variations, control theory and/or dynamic programming). This task is carried out in the present chapter.
We begin with a quick retake on the notion of derivative (of a real-to-real function), and point to the fact that there are advantages of viewing this notion as a particular linear functional, as opposed to a number. Once this point is understood, it becomes straightforward to extend the notion of “derivative” to the context of functions whose domains and codomains lie within arbitrary normed linear spaces. Moreover, the resulting derivative concept, called the Fréchet derivative, inherits many properties of the derivative that you are duly familiar with from classical calculus. We study this concept in fair detail here, go through several examples, and extend some well-known results of calculus to this realm, such as the Chain Rule, the Mean Value Theorem, etc. Keeping an eye on optimization theoretic applications, we also revisit the theory of concave functions, this time making use of Fréchet derivatives.1
The “use” of this work is demonstrated by means of a brief introduction to (infinite dimensional) optimization theory. Here we see how one can easily extend first- and second-order conditions for local extremum of real functions on the line to the broader context of real maps on normed linear spaces. We also show how useful concave functions are in this context. As for an application, and as a final order of business in this text, we sketch a precursory, but rigorous, introduction to calculus of variations, and consider a few of its economic applications.2
1 For reasons that largely escape me, most texts on functional analysis do not cover differential calculus on normed linear spaces. A thorough treatment of this topic can be found in Dieudonné (1969), but you may well find that exposition a little “heavy.” Some texts on optimization theory (such as Luenberger (1969)) do contain a discussion of Fréchet differentiation, but rarely develop the theory to the extent that we do here. The best reference I know about differential calculus on normed linear spaces is a little book by Cartan (1972). That book is unfortunately out of print at present (I was able to read it, I’m proud to say, from its Spanish edition), but if you can get a hold of it, you would be in for a treat.
2 Due to space constraints, I don’t go into control theory here even though this is among the standard methods of dynamic economic analysis in continuous time. To a careful eye, it will be evident that the machinery developed here can as well be used to go deep into constrained optimization over Banach spaces, from which point control theory is within a stone’s throw.
1 Fréchet Differentiation
So far in this text we have worked almost exclusively with limits of sequences. It will be convenient to depart from this practice in this chapter, and work instead with limits of functions. The basic idea is a straightforward generalization of one that you are surely familiar with from calculus (Section A.4.2).
Let $T$ be a nonempty subset of a normed linear space $X$ (whose norm is $\|\cdot\|$), and $\Phi$ a function that maps $T$ into a normed linear space $Y$ (whose norm is $\|\cdot\|_Y$). Let $x \in X$ be the limit of at least one sequence in $T\setminus\{x\}$.3
A point $y \in Y$ is said to be the limit of $\Phi$ at $x$, in which case we say “$\Phi(\omega)$ approaches $y$ as $\omega \to x$,” provided that $\Phi(x_m) \to y$ holds for every sequence $(x_m)$ in $T\setminus\{x\}$ with $x_m \to x$. This situation is denoted as
$$\lim_{\omega\to x}\Phi(\omega) = y.$$
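Clearly, we have $\lim_{\omega\to x}\Phi(\omega) = y$ iff, for each $\varepsilon > 0$, there exists a $\delta > 0$ such that $\|y - \Phi(\omega)\|_Y < \varepsilon$ for all $\omega \in T\setminus\{x\}$ with $\|\omega - x\| < \delta$. (Yes?) Thus,
$$\lim_{\omega\to x}\Phi(\omega) = y \quad\text{iff}\quad \lim_{\omega\to x}\|\Phi(\omega) - y\|_Y = 0.$$
Moreover, if $\Phi_1$ and $\Phi_2$ map $T$ into $Y$, and both $\lim_{\omega\to x}\Phi_1(\omega)$ and $\lim_{\omega\to x}\Phi_2(\omega)$ exist, then
$$\lim_{\omega\to x}\left(\alpha\Phi_1(\omega) + \Phi_2(\omega)\right) = \alpha\lim_{\omega\to x}\Phi_1(\omega) + \lim_{\omega\to x}\Phi_2(\omega)$$
for any $\alpha \in \mathbb{R}$. If $Y = \mathbb{R}$, and $\lim_{\omega\to x}\Phi_1(\omega)$ and $\lim_{\omega\to x}\Phi_2(\omega)$ are real numbers, then we also have
$$\lim_{\omega\to x}\Phi_1(\omega)\Phi_2(\omega) = \lim_{\omega\to x}\Phi_1(\omega)\,\lim_{\omega\to x}\Phi_2(\omega).$$
Finally, for $x \in \operatorname{int}_X(T)$, two maps $\Phi_1, \Phi_2 \in Y^T$ that are continuous at $x$ are said to be tangent at $x$ if
$$\lim_{\omega\to x}\frac{\|\Phi_1(\omega) - \Phi_2(\omega)\|_Y}{\|\omega - x\|} = 0.$$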
So, if $\Phi_1$ and $\Phi_2$ are tangent at $x$, then not only do we have $\Phi_1(x) = \Phi_2(x)$, but also, as $\omega \to x$, the distance between the values of these functions (i.e. $\|\Phi_1(\omega) - \Phi_2(\omega)\|_Y$) converges to 0 “faster” than $\omega$ approaches $x$. Put another way, near $x$, the values of $\Phi_2$ approximate $\Phi_1(x)$ better than $\omega$ approximates $x$ from the same distance. In this sense, we can think of $\Phi_2$ as a “best approximation” of $\Phi_1$ near $x$.
3 Of course, if $x \in \operatorname{int}_X(T)$, this condition is automatically satisfied. (Yes?)
1.2 What is a Derivative?
In calculus one is taught that the derivative of a function $f : \mathbb{R} \to \mathbb{R}$ at a point $x$ is a “number” that describes the rate of “instantaneous” change of the value of $f$ as $x$ changes. Yet, while useful for certain applications, this way of looking at things falls short of reflecting the intimate connection between the notion of the derivative of $f$ at $x$ and “the line that best approximates $f$ near $x$.” We begin our discussion by recalling this interpretation.4
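Let $O$ be an open subset of $\mathbb{R}$, $x \in O$, and $f \in \mathbb{R}^O$. Suppose $f$ is differentiable at $x$, that is, there is a real number $f'(x)$ with
$$f'(x) = \lim_{t\to x}\frac{f(t)-f(x)}{t-x}. \tag{2}$$
Equivalently,
$$\lim_{t\to x}\frac{f(t)-f(x)-f'(x)(t-x)}{|t-x|} = 0.$$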
Put differently, if $f$ is differentiable at $x$, then there exists a linear functional $L$ on $\mathbb{R}$, namely, $a \mapsto f'(x)a$, such that the affine map $t \mapsto f(x) + L(t - x)$ is tangent to $f$ at $x$. The converse is also true. Indeed, if $L : \mathbb{R} \to \mathbb{R}$ is a linear functional such that $t \mapsto f(x) + L(t - x)$ is tangent to $f$ at $x$, then there exists an $\alpha \in \mathbb{R}$ such that $L(a) = \alpha a$ for all $a \in \mathbb{R}$, and
$$\lim_{t\to x}\left|\frac{f(t)-f(x)}{t-x} - \alpha\right| = \lim_{t\to x}\frac{|f(t)-f(x)-\alpha(t-x)|}{|t-x|} = 0.$$
It follows that $f$ is differentiable at $x$, and $f'(x) = \alpha$.
This elementary argument establishes that $f$ is differentiable at $x$ iff there is a linear functional $L$ on $\mathbb{R}$ such that
$$\lim_{t\to x}\frac{f(t)-f(x)-L(t-x)}{|t-x|} = 0, \tag{3}$$
that is, the affine map $t \mapsto f(x) + L(t - x)$ is tangent to $f$ at $x$; it approximates $f$ around $x$ so well that, as $t \to x$, the error $f(t) - f(x) - L(t - x)$ of this approximation decreases to 0 faster than $t$ tends to $x$.
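The tangency condition can be watched numerically. The following is only a minimal sketch: the choice $f = \sin$, the point $x = 1$ and the helper name `tangent_error_ratio` are illustrative assumptions, not part of the text. It compares the error ratio for the slope $f'(x)$ with the ratio for a slightly different slope.

```python
import math

def tangent_error_ratio(f, slope, x, t):
    """Return |f(t) - f(x) - slope*(t - x)| / |t - x|."""
    return abs(f(t) - f(x) - slope * (t - x)) / abs(t - x)

# Illustrative assumptions: f = sin and x = 1.
x = 1.0
f = math.sin
good_slope = math.cos(x)       # the classical derivative f'(x)
bad_slope = good_slope + 0.1   # a linear functional with the wrong slope

for k in range(1, 6):
    t = x + 10.0 ** (-k)
    print(k,
          tangent_error_ratio(f, good_slope, x, t),
          tangent_error_ratio(f, bad_slope, x, t))
```

With the correct slope the ratio keeps shrinking as $t$ approaches $x$, whereas with the perturbed slope it settles near the size of the perturbation, so only the former affine map is tangent to $f$ at $x$.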
∗ ∗ ∗ ∗ FIGURE K.1 ABOUT HERE ∗ ∗ ∗ ∗
This is the key idea behind the very concept of differentiation: The local behavior of a differentiable function is linear just like that of an affine map. Thus, it makes sense to consider the linear functional $L$ of (3) as central to the notion of derivative of $f$ at $x$. In fact, it would be more honest to refer to $L$ itself as the “derivative” of $f$ at $x$. From this point of view, the number $f'(x)$ is simply the slope of $L$; it is none other than a convenient way of identifying this linear functional.
4 “In the classical teaching of calculus, this idea is immediately obscured by the accidental fact that, on a one-dimensional vector space, there is a one-to-one correspondence between linear functionals and numbers, and therefore the derivative at a point is defined as a number instead of a linear functional. This slavish subservience to the shibboleth of numerical interpretation at any cost becomes much worse when dealing with functions of several variables ...” Dieudonné (1968, p. 147).
Okay, what’s the big deal? It may seem like we are fighting over semantics here. What difference does it make if we view the linear functional $t \mapsto f'(x)t$ as the derivative of $f$ at $x$, instead of the number $f'(x)$? Well, think about it. How would you define the derivative of $f$ at $x$ if this function is defined on an open subset $O$ of $\mathbb{R}^2$? The classical definition (2), which formalizes the notion of “rate of change,” immediately runs into difficulties in this situation. But the idea of “finding an affine map which is tangent to $f$ at $x$” survives with no trouble whatsoever. All we have to do is to define the derivative of $f$ at $x$ as the linear functional $L$ on $\mathbb{R}^2$ with the property that
$$\lim_{t\to x}\frac{f(t)-f(x)-L(t-x)}{\|t-x\|_2} = 0.$$
Geometrically speaking, the graph of the affine map $t \mapsto f(x) + L(t - x)$ is none other than the hyperplane tangent to the graph of $f$ at $x$ (Figure 2).
∗ ∗ ∗ ∗ FIGURE K.2 ABOUT HERE ∗ ∗ ∗ ∗
Looking at the derivative of a function the “right” way saves the day in many other circumstances. Since the notion of tangency is well-defined for functions that map a normed linear space into another, this point of view remains meaningful for any such function, and paves the way toward the general theory we are about to lay out.5
Definition Let $X$ and $Y$ be two normed linear spaces, and $T$ a subset of $X$. For any $x \in \operatorname{int}_X(T)$, a map $\Phi : T \to Y$ is said to be Fréchet differentiable at $x$ if there is a continuous linear operator $D_{\Phi,x} \in \mathcal{B}(X, Y)$ such that
$$\lim_{\omega\to x}\frac{\Phi(\omega)-\Phi(x)-D_{\Phi,x}(\omega-x)}{\|\omega-x\|} = 0. \tag{4}$$
The linear operator $D_{\Phi,x}$ is called the Fréchet derivative of $\Phi$ at $x$.6
5 Part of this theory can be developed within the context of metric linear spaces as well. However, I will work exclusively with normed linear spaces in this chapter, as the applied strength of differential calculus on metric linear spaces is nowhere near that on normed linear spaces.
6 Just to be on the safe side, let me remind you that 0 in (4) is the origin of $Y$. Put differently, (4) means
$$\lim_{\omega\to x}\frac{\|\Phi(\omega)-\Phi(x)-D_{\Phi,x}(\omega-x)\|_Y}{\|\omega-x\|} = 0,$$
If $O$ is a nonempty open subset of $T$, and if $\Phi$ is Fréchet differentiable at every $x \in O$, then we say that $\Phi$ is Fréchet differentiable on $O$. If $O = \operatorname{int}_X(T)$ here, then $\Phi$ is said to be Fréchet differentiable. Finally, we say that $\Phi$ is continuously Fréchet differentiable if it is Fréchet differentiable and the map $D_\Phi : \operatorname{int}_X(T) \to \mathcal{B}(X, Y)$, defined by $D_\Phi(x) := D_{\Phi,x}$, is continuous.
The idea should be clear at this point. Just as in plain ol’ calculus, we perturb $x \in \operatorname{int}_X(T)$ infinitesimally, and look at the behavior of the difference-quotient of $\Phi$. Of course, here we can perturb $x$ in all sorts of different ways. Indeed, since $\operatorname{int}_X(T)$ is open in $X$, any $\omega \in X$ can be thought of as a perturbation of $x$, provided that $\|\omega - x\|$ is small enough. Thus, the local linear behavior of $\Phi$ at $x$ must be captured by a linear operator defined on the entire $X$. The Fréchet derivative of $\Phi$ at $x$ is, then, a linear operator $D_{\Phi,x}$ defined on $X$, and one that ensures that the (globally linear) behavior of the affine map $\omega \mapsto \Phi(x) + D_{\Phi,x}(\omega - x)$ approximates the (locally linear) behavior of $\Phi$ at $x$ accurately. (See Proposition 2 below.)
Before we jump into examples, there are a few matters to clarify. First, we need to justify why we call $D_{\Phi,x}$ “the” Fréchet derivative of $\Phi$ at $x$. How do we know that there is a unique $D_{\Phi,x}$ in $\mathcal{B}(X, Y)$ that satisfies (4)? To settle this matter, take any two $K, L \in \mathcal{B}(X, Y)$ and suppose that (4) holds both with $D_{\Phi,x} = K$ and $D_{\Phi,x} = L$. We must then have
$$\lim_{\omega\to x}\frac{(K-L)(\omega-x)}{\|\omega-x\|} = 0.$$
(Yes?) Since $\operatorname{int}_X(T)$ is open, this is equivalent to saying that
$$\lim_{\nu\to 0}\frac{(K-L)(\nu)}{\|\nu\|} = 0.$$
(Why?7) It follows that
$$0 = \lim_{m\to\infty}\frac{(K-L)\left(\frac{1}{m}y\right)}{\left\|\frac{1}{m}y\right\|} = \frac{(K-L)(y)}{\|y\|} \quad\text{for all nonzero } y \in X,$$
so we have $K = L$. Conclusion: When it exists, the Fréchet derivative of a function at any given point in the interior of its domain is unique.
6 (continued) … that is, for every $\varepsilon > 0$, there exists a $\delta > 0$ such that $\|\Phi(\omega) - \Phi(x) - D_{\Phi,x}(\omega - x)\|_Y \le \varepsilon\|\omega - x\|$ for each $\omega \in N_{\delta,X}(x)$.
7 Be careful here. If $x$ did not belong to the interior of $T$ (in $X$), the former equation would not imply the latter.
The second issue we should discuss is why we define the Fréchet derivative of a function at a point as a continuous linear operator. Intuitively speaking, the main reason for this is that the notion of tangency makes geometric sense only when the functions involved are continuous at the point of tangency. So, at least at the point that we wish to define the derivative of the function, we had better ask the linear operator we seek to be continuous. But, of course, this is the same thing as asking for the continuity of that operator everywhere (why?), and hence we ask the Fréchet derivative of a function at a point to be continuous.8 By the way, as a major side benefit of this requirement, we are able to maintain the familiar rule
differentiability $\Longrightarrow$ continuity.
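Proposition 1 Let $X$ and $Y$ be two normed linear spaces, $T$ a subset of $X$, and $x \in \operatorname{int}_X(T)$. If $\Phi \in Y^T$ is Fréchet differentiable at $x$, then $\Phi$ is continuous at $x$.
Proof Take any $\Phi \in Y^T$ such that (4) holds for some $D_{\Phi,x} \in \mathcal{B}(X, Y)$. Then
$$\lim_{\omega\to x}\left(\Phi(\omega) - \Phi(x) - D_{\Phi,x}(\omega - x)\right) = 0.$$
Since $D_{\Phi,x}$ is continuous, we have $\lim_{\omega\to x}D_{\Phi,x}(\omega - x) = D_{\Phi,x}(0) = 0$. It follows that $\lim_{\omega\to x}\Phi(\omega) = \Phi(x)$, that is, $\Phi$ is continuous at $x$.9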
Finally, let us mention two alternate formulations of the definition of the Fréchet derivative at a given point $x$ in the interior of the domain of a map $\Phi \in Y^T$. Immediate in this regard is the observation that, by changing variables, we can write (4) equivalently as
$$\lim_{\nu\to 0}\frac{\Phi(x+\nu)-\Phi(x)-D_{\Phi,x}(\nu)}{\|\nu\|} = 0.$$
(Yes?) When convenient, this formulation can be used instead of (4).
Our second reformulation stems from the fact that, just as in the case of one-variable calculus, the derivative notion that we consider here corresponds to a best local approximation of a given function. This is not entirely trivial, so we state it in precise terms.
Proposition 2 Let $X$ and $Y$ be two normed linear spaces, $T$ a subset of $X$, $x \in \operatorname{int}_X(T)$, and $L \in \mathcal{B}(X, Y)$. For any $\Phi \in Y^T$, $L$ is the Fréchet derivative of $\Phi$ at $x$ if, and only if, there exists a continuous map $e \in Y^T$ such that
$$\Phi(\omega) = \Phi(x) + L(\omega - x) + e(\omega) \quad\text{for all } \omega \in \operatorname{int}_X(T), \quad\text{and}\quad \lim_{\omega\to x}\frac{e(\omega)}{\|\omega-x\|} = 0.$$
8 This issue never arises in classical calculus, as any linear operator from a Euclidean space into another is, perforce, continuous.
9 Just so that we’re on safe grounds here, let me note that I got $\lim_{\omega\to x}\Phi(\omega) = \Phi(x)$ from
$$\lim_{\omega\to x}\left(\Phi(\omega) - \Phi(x)\right) = \lim_{\omega\to x}\left(\Phi(\omega) - \Phi(x) - D_{\Phi,x}(\omega - x)\right) + \lim_{\omega\to x}D_{\Phi,x}(\omega - x) = 0.$$
So, if $\Phi \in Y^T$ is Fréchet differentiable at $x \in \operatorname{int}_X(T)$, then the affine map $\omega \mapsto \Phi(x) + D_{\Phi,x}(\omega - x)$ is a best approximation of $\Phi$ at $x$ in the sense that, as $\omega \to x$, the “size of the error” involved in this approximation, that is $\|e(\omega)\|_Y$, vanishes faster than $\|\omega - x\|$ goes to zero.
Exercise 1 Prove Proposition 2.
Exercise 2 For any two normed linear spaces $X$ and $Y$, show that $D_{L,x} = L$ for any $L \in \mathcal{B}(X, Y)$ and $x \in X$. (That is, the Fréchet derivative of a linear operator at any point in its domain equals the linear operator itself. Wouldn’t you expect this?)
Exercise 3 Let $X$ and $Y$ be two normed linear spaces, and $\phi : X \times Y \to \mathbb{R}$ a continuous bilinear functional (Exercise J.52). Show that $\phi$ is Fréchet differentiable, and, for any $(x^*, y^*) \in X \times Y$,
$$D_{\phi,(x^*,y^*)}(z, w) = \phi(z, y^*) + \phi(x^*, w).$$
Exercise 4 Let $X$ and $Y$ be two normed linear spaces, $O$ a nonempty open subset of $X$, and $\Phi \in Y^O$ a Fréchet differentiable map. Show that $\Phi$ would remain Fréchet differentiable if we replaced the norms of $X$ and $Y$ by equivalent norms, respectively. (Recall Section J.4.2.) Moreover, the Fréchet derivative of $\Phi$ would be the same in both cases.
Exercise 5 (The Gateaux Derivative) Let $X$ and $Y$ be two normed linear spaces and $x \in X$. The map $\Phi \in Y^X$ is said to be Gateaux differentiable at $x$ if there exists an $L \in \mathcal{B}(X, Y)$ such that
$$\lim_{\varepsilon\to 0}\left\|\frac{1}{\varepsilon}\left(\Phi(x + \varepsilon y) - \Phi(x)\right) - L(y)\right\|_Y = 0 \quad\text{for all } y \in X.$$
Here $L$ is called the Gateaux derivative of $\Phi$ at $x$. (The idea is the generalization of that behind the notion of directional derivatives.)
(a) Show that, when it exists, the Gateaux derivative of $\Phi$ at $x$ is unique.
(b) Prove: If $\Phi$ is Fréchet differentiable at $x$, then it is Gateaux differentiable at $x$, and its Gateaux derivative at $x$ equals $D_{\Phi,x}$.
Example 1 [1] Let $I$ be an open interval and $x \in I$. The Fréchet derivative of a differentiable function $f \in \mathbb{R}^I$ at $x$ is the linear map $t \mapsto f'(x)t$ on $\mathbb{R}$, that is,
$$D_{f,x}(t) = f'(x)t \quad\text{for all } t \in \mathbb{R}.$$
Thus the Fréchet derivative of $f$ at $x$ is exactly what we argued in Section 1.2 the “derivative of $f$ at $x$” should mean. The number $f'(x)$ serves only to identify the linear map a certain shift of which gives us a best approximation of $f$ near $x$.
We observe here that any differentiable function $f \in \mathbb{R}^I$ is Fréchet differentiable, and $D_f : I \to \mathcal{B}(\mathbb{R}, \mathbb{R})$ satisfies
$$D_f(x)(t) = f'(x)t \quad\text{for all } x \in I \text{ and } t \in \mathbb{R}.$$
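[2] Take any $m \in \mathbb{N}$ and any open interval $I$. If $\Phi : I \to \mathbb{R}^m$ is Fréchet differentiable at $x \in I$, then $D_{\Phi,x}$ is a linear operator from $\mathbb{R}$ into $\mathbb{R}^m$ given by
$$D_{\Phi,x}(t) = \left(\Phi_1'(x)t, \dots, \Phi_m'(x)t\right) \quad\text{for all } -\infty < t < \infty,$$
where $\Phi_i \in \mathbb{R}^I$ is the $i$th component map of $\Phi$, $i = 1, \dots, m$. This is a special case of a result we shall prove shortly.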
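Reminder Given any $n \in \mathbb{N}$, let $S$ be a subset of $\mathbb{R}^n$ with nonempty interior. Where $e^j$ denotes the $j$th unit vector in $\mathbb{R}^n$, the $j$th partial derivative of $\phi \in \mathbb{R}^S$ at $x \in \operatorname{int}_{\mathbb{R}^n}(S)$ is defined as
$$\partial_j\phi(x) := \lim_{\varepsilon\to 0}\frac{\phi(x + \varepsilon e^j) - \phi(x)}{\varepsilon},$$
provided that this limit exists.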
[3] Let $n \in \mathbb{N}$ and take any open subset $O$ of $\mathbb{R}^n$. As we show below, if $x \in O$ and $\phi \in \mathbb{R}^O$ is Fréchet differentiable at $x$, then all partial derivatives of $\phi$ at $x$ exist, and we have
$$D_{\phi,x}(t_1, \dots, t_n) = \sum_{j=1}^n \partial_j\phi(x)t_j \quad\text{for all } (t_1, \dots, t_n) \in \mathbb{R}^n.$$
Thus, the Fréchet derivative of $\phi$ at $x$ is none other than the linear functional that corresponds to the $n$-vector $(\partial_1\phi(x), \dots, \partial_n\phi(x))$, which, as you know, is called the gradient of $\phi$ at $x$.
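[4] We now generalize. Given any $m, n \in \mathbb{N}$, let $O$ be an open subset of $\mathbb{R}^n$, and fix any $x \in O$. Take any $\Phi_i \in \mathbb{R}^O$, $i = 1, \dots, m$, and define $\Phi : O \to \mathbb{R}^m$ by
$$\Phi(t_1, \dots, t_n) := \left(\Phi_1(t_1, \dots, t_n), \dots, \Phi_m(t_1, \dots, t_n)\right).$$
(Here the $\Phi_i$s are the component maps of $\Phi$.) If $\Phi$ is Fréchet differentiable at $x$, then the partial derivatives of each $\Phi_i \in \mathbb{R}^O$ at $x$ exist, and we have
$$D_{\Phi,x}(t_1, \dots, t_n) = \left(\sum_{j=1}^n \partial_j\Phi_1(x)t_j, \dots, \sum_{j=1}^n \partial_j\Phi_m(x)t_j\right) \quad\text{for all } (t_1, \dots, t_n) \in \mathbb{R}^n,$$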
where $\partial_j\Phi_i(x)$ is the $j$th partial derivative of $\Phi_i$ at $x$, $i = 1, \dots, m$, $j = 1, \dots, n$.10 Or, put differently, the linear operator $D_{\Phi,x}$ satisfies
$$D_{\Phi,x}(y) = J^x_\Phi\, y \quad\text{for all } y \in \mathbb{R}^n, \tag{6}$$
where $J^x_\Phi$ is the Jacobian matrix of $\Phi$ at $x$, that is, $J^x_\Phi := [\partial_j\Phi_i(x)]_{m\times n}$.
Just like $f'(x)$ in [1] turned out to be the number that identifies the Fréchet derivative of $f$ at $x$, we see here that the Jacobian matrix of $\Phi$ at $x$ identifies the Fréchet derivative of $\Phi$ at $x$. A certain shift of this operator, namely, $(\Phi(x) - D_{\Phi,x}(x)) + D_{\Phi,x}$, is an affine map from $\mathbb{R}^n$ into $\mathbb{R}^m$ that best approximates $\Phi$ near $x$.
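To prove (6), observe that, since $D_{\Phi,x} \in \mathcal{L}(\mathbb{R}^n, \mathbb{R}^m)$, there must exist a matrix $A := [a_{ij}]_{m\times n}$ with $D_{\Phi,x}(y) = Ay$ for all $y \in \mathbb{R}^n$ (Example F.6). Then, by definition,
$$\lim_{\omega\to x}\frac{\Phi_i(\omega)-\Phi_i(x)-\sum_{j=1}^n a_{ij}(\omega_j - x_j)}{\|\omega-x\|_2} = 0, \quad i = 1, \dots, m.$$
It follows that
$$\lim_{\varepsilon\to 0}\frac{\Phi_i(x + \varepsilon e^j)-\Phi_i(x)-\varepsilon a_{ij}}{|\varepsilon|} = 0,$$
that is, $\partial_j\Phi_i(x)$ exists and equals $a_{ij}$, for each $i$ and $j$. Hence $D_{\Phi,x}(y) = Ay = J^x_\Phi\, y$ for all $y \in \mathbb{R}^n$.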
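As a rough numerical illustration of how the Jacobian matrix identifies the Fréchet derivative, here is a minimal Python sketch. The map `Phi`, the base point and the direction of perturbation are hypothetical choices made only for this illustration; the Jacobian is entered by hand from the partial derivatives.

```python
import math

def Phi(t):
    """A hypothetical smooth map from R^2 into R^2 (not from the text)."""
    t1, t2 = t
    return (t1 * t1 * t2, math.sin(t1) + t2)

def jacobian(t):
    """Jacobian matrix of Phi at t, written down from the partial derivatives."""
    t1, t2 = t
    return ((2 * t1 * t2, t1 * t1),
            (math.cos(t1), 1.0))

def norm2(v):
    """Euclidean norm of a tuple."""
    return math.sqrt(sum(c * c for c in v))

def error_ratio(x, y):
    """||Phi(x + y) - Phi(x) - J y||_2 / ||y||_2, where J is the Jacobian at x."""
    J = jacobian(x)
    Jy = tuple(sum(J[i][j] * y[j] for j in range(2)) for i in range(2))
    moved = Phi((x[0] + y[0], x[1] + y[1]))
    base = Phi(x)
    err = tuple(moved[i] - base[i] - Jy[i] for i in range(2))
    return norm2(err) / norm2(y)

x = (1.0, 2.0)
# Perturbations of length 10^-k along the fixed direction (0.6, -0.8).
ratios = [error_ratio(x, (0.6 * 10.0 ** (-k), -0.8 * 10.0 ** (-k)))
          for k in range(1, 6)]
print(ratios)
```

The printed ratios decrease toward 0 as the perturbation shrinks, which is what the limit condition for $D_{\Phi,x}(y) = J^x_\Phi\, y$ requires along this direction. A check along one direction is of course not a proof.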
Exercise 7 True or false: $f : \mathbb{R} \to \mathbb{R}$ is continuously differentiable iff it is continuously Fréchet differentiable.
Exercise 8 Define $\phi : \mathbb{R}^2 \to \mathbb{R}$ by $\phi(0) := 0$ and $\phi(x) := \dfrac{x_1x_2}{\|x\|_2}$ for all $x \neq 0$. Show that $\phi$ is continuous and both $\partial_1\phi(\cdot)$ and $\partial_2\phi(\cdot)$ are well-defined on $\mathbb{R}^2$, whereas $\phi$ is not Fréchet differentiable at $0$.
∗ Exercise 9.H Let $n \in \mathbb{N}$, and $O$ a nonempty open and convex subset of $\mathbb{R}^n$. Take any $\phi \in C(O)$ such that $x \mapsto \partial_i\phi(x)$ is a continuous function on $O$, $i = 1, \dots, n$. Show that $\phi$ is Fréchet differentiable.
Exercise 10 State and prove a generalization of the previous result, which applies to continuous maps from a nonempty open and convex subset of $\mathbb{R}^n$ into $\mathbb{R}^m$, $n, m \in \mathbb{N}$.
Exercise 11 Given any $n \in \mathbb{N}$, let $X_i$ be a normed linear space, $i = 1, \dots, n$, and $O$ a nonempty open subset of the product normed linear space $X := X_1 \times \cdots \times X_n$. Fix any $x \in O$, and, for each $i$, let $O_i := \{z_i \in X_i : (z_i, x_{-i}) \in O\}$, which is an open subset of $X_i$. Prove: If $\phi : O \to \mathbb{R}$ is Fréchet differentiable at $x$, then $\phi(\cdot, x_{-i}) \in \mathbb{R}^{O_i}$ is Fréchet differentiable at $x_i$, $i = 1, \dots, n$, and we have
$$D_{\phi,x}(z_1, \dots, z_n) = \sum_{i=1}^n D_{\phi(\cdot,x_{-i}),x_i}(z_i).$$
(Compare with Example 1.[3].)
10 Please note that I do not at all claim here that $\Phi$ is necessarily Fréchet differentiable at $x$ when the partial derivatives of each $\Phi_i \in \mathbb{R}^O$ at $x$ exist. This is, in fact, not true, simply because continuity of a map on $\mathbb{R}^n$ in each of its components does not imply the overall continuity of the map. (See Exercises 8 and 9 below.)
The rest of the examples considered here all work within the context of infinite dimensional normed linear spaces. We begin with a particularly simple one, and move towards more involved examples.
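Example 2 Define $\phi : C[0, 1] \to \mathbb{R}$ by
$$\phi(f) := \int_0^1 f(t)^2\,dt.$$
Fix any $f \in C[0, 1]$, and observe that
$$\phi(h) - \phi(f) = 2\int_0^1 f(t)(h(t) - f(t))\,dt + \int_0^1 (h(t) - f(t))^2\,dt$$
for any $h \in C[0, 1]$. Notice that, as $\|h - f\|_\infty \to 0$, the last term here approaches 0 faster than $\|h - f\|_\infty$ vanishes. Indeed, if we define $e \in \mathbb{R}^{C[0,1]}$ by $e(h) := \int_0^1 (h(t) - f(t))^2\,dt$, then
$$\lim_{h\to f}\frac{|e(h)|}{\|h - f\|_\infty} \le \lim_{h\to f}\|h - f\|_\infty = 0.$$
Hence, by Proposition 2, $\phi$ is Fréchet differentiable, and
$$D_\phi(f)(g) = 2\int_0^1 f(t)g(t)\,dt \quad\text{for all } f, g \in C[0, 1].$$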
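The computation in this example can be sanity-checked numerically. The sketch below is illustrative only: it approximates the integrals by a midpoint rule with an arbitrary number `N` of cells, and the particular functions `f` and `g` are assumptions chosen for the demonstration. It perturbs $f$ to $h = f + sg$ and reports $|e(h)|/\|h - f\|_\infty$.

```python
import math

N = 2000  # number of midpoint-rule cells (an arbitrary choice)

def integrate(func):
    """Midpoint-rule approximation of the integral of func over [0, 1]."""
    return sum(func((i + 0.5) / N) for i in range(N)) / N

def phi(f):
    """phi(f) = integral of f(t)^2 over [0, 1]."""
    return integrate(lambda t: f(t) ** 2)

def D_phi(f, g):
    """Candidate Frechet derivative of phi at f, evaluated at g."""
    return 2.0 * integrate(lambda t: f(t) * g(t))

# Illustrative assumptions: f = exp, and g = cos as the direction of perturbation.
f = math.exp
g = math.cos

def error_ratio(s):
    """|phi(h) - phi(f) - D_phi(f, h - f)| / ||h - f||_inf for h = f + s*g.

    The sup-norm of s*cos on [0, 1] equals |s|, since cos attains 1 at t = 0.
    """
    h = lambda t: f(t) + s * g(t)
    e = phi(h) - phi(f) - s * D_phi(f, g)
    return abs(e) / abs(s)

for s in (1e-1, 1e-2, 1e-3):
    print(s, error_ratio(s))
```

The ratio shrinks in proportion to $s$, in line with the bound $|e(h)| \le \|h - f\|_\infty^2$ used above.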
Exercise 12.H Show that the map $\phi \in \mathbb{R}^{C[0,1]}$ defined by $\phi(f) := f(0)^2$ is Fréchet differentiable, and compute $D_\phi$.
Exercise 13.H Show that the map $\phi \in \mathbb{R}^{C[0,1]}$ defined by $\phi(f) := \frac{1}{3}\int_0^1 f(t)^3\,dt$ is Fréchet differentiable, and compute $D_\phi$.
One-variable differential calculus is often useful in computing the Fréchet derivative of a given function, even when the domain of this function is infinite dimensional. In particular, Taylor’s Theorem is extremely useful for this purpose. Given our present purposes, all we need is the following “baby” version of that result.
The Second Mean Value Theorem. Let $a$ and $b$ be two distinct real numbers, and $I := \operatorname{co}\{a, b\}$. If $f : I \to \mathbb{R}$ is continuously differentiable on $I$ and twice differentiable on $\operatorname{int}_{\mathbb{R}}(I)$, then
$$f(b) - f(a) = f'(a)(b - a) + \tfrac{1}{2}f''(c)(b - a)^2 \quad\text{for some } c \in I\setminus\{a, b\}.$$
Proof The idea of the proof is reminiscent of the usual way in which one deduces the Mean Value Theorem from Rolle’s Theorem (Exercise A.56). Define $g : I \to \mathbb{R}$ by
$$g(t) := f(b) - f(t) - f'(t)(b - t) - \tfrac{1}{2}M(b - t)^2,$$
where $M \in \mathbb{R}$ is chosen to guarantee that $g(a) = 0$. Clearly, $g$ is differentiable on $\operatorname{int}_{\mathbb{R}}(I)$, and a quick computation yields
$$g'(t) = (b - t)(M - f''(t)) \quad\text{for any } t \in I\setminus\{a, b\}.$$
Moreover, since $g(a) = 0 = g(b)$ and $g$ is continuous on $I$, Rolle’s Theorem guarantees that $g'(c) = 0$ for some $c \in I\setminus\{a, b\}$. But then $M = f''(c)$, and we find
$$0 = g(a) = f(b) - f(a) - f'(a)(b - a) - \tfrac{1}{2}f''(c)(b - a)^2,$$
which is the claimed equation.
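As a quick worked instance of the theorem (the function $f(t) = t^3$ and the endpoints $a = 0$, $b = 1$ are chosen here purely for illustration), the point $c$ can be found explicitly:

```latex
\begin{align*}
f(t) &= t^3, \qquad a = 0, \quad b = 1, \qquad f'(t) = 3t^2, \quad f''(t) = 6t,\\
f(b) - f(a) &= 1, \qquad f'(a)(b - a) = 0,\\
1 &= 0 + \tfrac{1}{2}\,f''(c)\,(1 - 0)^2 = 3c
\quad\Longrightarrow\quad c = \tfrac{1}{3} \in (0, 1).
\end{align*}
```

So in this case the theorem holds with $c = 1/3$, which indeed lies in $I\setminus\{a, b\}$.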
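… $(\omega^*_m) \in \ell^\infty$ such that
$$u(\omega_i) - u(x_i) = u'(x_i)(\omega_i - x_i) + \tfrac{1}{2}u''(\omega^*_i)(\omega_i - x_i)^2$$
and $\omega^*_i \in \operatorname{co}\{\omega_i, x_i\}$ for each $i = 1, 2, \dots$ Consequently,
$$\phi((\omega_m)) - \phi((x_m)) = \sum_{i=1}^\infty \delta^i u'(x_i)(\omega_i - x_i) + \tfrac{1}{2}\sum_{i=1}^\infty \delta^i u''(\omega^*_i)(\omega_i - x_i)^2 \tag{7}$$
for any $(\omega_m) \in \ell^\infty$.11 Again, the trick here is to notice that, as $(\omega_m) \to (x_m)$, the last term of this equation approaches 0 faster than $\|(\omega_m - x_m)\|_\infty$ vanishes. Since we assume that $u''$ is bounded here, say by the real number $M > 0$, it is very easy to show this. Define $e : \ell^\infty \to \mathbb{R}$ by $e((\omega_m)) := \tfrac{1}{2}\sum_{i=1}^\infty \delta^i u''(\omega^*_i)(\omega_i - x_i)^2$, and note that
$$|e((\omega_m))| \le \frac{M}{2}\left(\sum_{i=1}^\infty \delta^i\right)\|(\omega_m - x_m)\|_\infty^2$$
for any $(\omega_m) \in \ell^\infty$. It follows that
$$\lim_{(\omega_m)\to(x_m)}\frac{|e((\omega_m))|}{\|(\omega_m - x_m)\|_\infty} = 0,$$
as desired. Hence, by Proposition 2, we may conclude that
$$D_{\phi,(x_m)}((y_m)) = \sum_{i=1}^\infty \delta^i u'(x_i)y_i \quad\text{for all } (y_m) \in \ell^\infty.$$
What does this mean, intuitively? Well, consider the map $\psi : \ell^\infty \to \mathbb{R}$ defined by
$$\psi((y_m)) := \phi((x_m)) + \sum_{i=1}^\infty \delta^i u'(x_i)(y_i - x_i).$$
This is an affine map, so its behavior is globally linear. And we have just found out that, near $(x_m)$, the behavior of $\psi$ and our original (nonlinear) map $\phi$ are “very similar,” in the sense that these two maps are tangent to each other at $(x_m)$. So, the local linear behavior of $\phi$ around $(x_m)$ is best captured by the linear behavior of the affine map $\psi$.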
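… $\dfrac{\|e(h)\|_\infty}{\|h - f\|_\infty} \to 0$ as $h \to f$, then we may conclude that …
But
$$\frac{\|e(h)\|_\infty}{\|h - f\|_\infty} \le \tfrac{1}{2}\|H'' \circ \theta_h\|_\infty\|h - f\|_\infty,$$
so all we need here is to show that $\|H'' \circ \theta_h\|_\infty$ is uniformly bounded for any $h$ which is sufficiently close to $f$. This is quite easy. Obviously, we have $\|\theta_h\|_\infty \le \|f\|_\infty + 1$ for any $h \in N_{1,C[0,1]}(f)$. Moreover, since $H''$ is continuous, there is a number $M > 0$ such that $|H''(a)| \le M$ for all $a \in \mathbb{R}$ with $|a| \le \|f\|_\infty + 1$. Thus:
$$\frac{\|e(h)\|_\infty}{\|h - f\|_\infty} \le \frac{M}{2}\|h - f\|_\infty \quad\text{for any } h \in N_{1,C[0,1]}(f)\setminus\{f\}.$$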
∗ Exercise 14 (The Nemyitskiĭ Operator) Let $H : \mathbb{R}^2 \to \mathbb{R}$ be a continuous function such that $\partial_2H$ and $\partial_2\partial_2H$ are continuous functions on $\mathbb{R}^2$. Define the self-map $\Phi$ on $C[0, 1]$ by $\Phi(f)(t) := H(t, f(t))$. Show that $\Phi$ is Fréchet differentiable, and
$$D_\Phi(f)(g)(t) = \partial_2H(t, f(t))g(t) \quad\text{for all } f, g \in C[0, 1] \text{ and } 0 \le t \le 1.$$
An important topic in functional analysis concerns the determination of normed linear spaces the norms of which are Fréchet differentiable We will have little to say on this topic in this book, but the following exercises might give you at least an idea about it.
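Exercise 15 Let $(x_m) \in \ell^2\setminus\{0\}$, and show that $\|\cdot\|_2 : \ell^2 \to \mathbb{R}_+$ is Fréchet differentiable at $(x_m)$ with
$$D_{\|\cdot\|_2,(x_m)}((y_m)) = \frac{1}{\|(x_m)\|_2}\sum_{i=1}^\infty x_iy_i \quad\text{for all } (y_m) \in \ell^2.$$
The following exercise generalizes this observation.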
Exercise 16 Let $X$ be a pre-Hilbert space (Exercise J.12) with the inner product $\varphi$. A famous theorem of linear analysis states that for any continuous linear functional $L$ on $X$, there exists a $y \in X$ such that $L(x) = \varphi(x, y)$ for all $x \in X$. (This is the Riesz Representation Theorem.) Assuming the validity of this fact, show that the norm $\|\cdot\|$ of $X$ is Fréchet differentiable at each $x \in X\setminus\{0\}$, and we have $D_{\|\cdot\|,x}(y) = \dfrac{\varphi(x, y)}{\|x\|}$ for all $x \in X\setminus\{0\}$ and $y \in X$.
Exercise 17 It is well known that for any $L \in \mathcal{B}(\ell^1, \mathbb{R})$ there exists an $(a_m) \in \ell^\infty$ such that $L((x_m)) = \sum_{i=1}^\infty a_ix_i$. (You don’t have to prove this result here.) Use this fact to show that the norm $\|\cdot\|_1$ on $\ell^1$ is not Fréchet differentiable anywhere.
∗ Exercise 18.H Determine all points at which $\|\cdot\|_\infty : \ell^\infty \to \mathbb{R}_+$ is Fréchet differentiable.
Most of the basic rules of differentiation of one-variable calculus have straightforward generalizations in terms of Fréchet derivatives. Just as you would suspect, for instance, the Fréchet derivative of a given linear combination of Fréchet differentiable operators equals that linear combination of the Fréchet derivatives of the involved operators.
Proposition 3 Let $X$ and $Y$ be two normed linear spaces, $T$ a subset of $X$, and $x \in \operatorname{int}_X(T)$. Let $\Phi$ and $\Psi$ be two maps in $Y^T$ which are Fréchet differentiable at $x$. Then, for any real number $\alpha$, $\alpha\Phi + \Psi$ is Fréchet differentiable at $x$, and
$$D_{\alpha\Phi+\Psi,x} = \alpha D_{\Phi,x} + D_{\Psi,x}.$$
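Proof By Proposition 2, there exist maps $e_\Phi$ and $e_\Psi$ in $Y^T$ such that
$$\Phi(\omega) - \Phi(x) = D_{\Phi,x}(\omega - x) + e_\Phi(\omega) \quad\text{and}\quad \Psi(\omega) - \Psi(x) = D_{\Psi,x}(\omega - x) + e_\Psi(\omega)$$
for all $\omega \in \operatorname{int}_X(T)$, and $\lim_{\omega\to x}\frac{e_\Phi(\omega)}{\|\omega - x\|} = 0 = \lim_{\omega\to x}\frac{e_\Psi(\omega)}{\|\omega - x\|}$. Multiplying the first equation by $\alpha$ and adding the second, we find
$$(\alpha\Phi + \Psi)(\omega) - (\alpha\Phi + \Psi)(x) = (\alpha D_{\Phi,x} + D_{\Psi,x})(\omega - x) + (\alpha e_\Phi + e_\Psi)(\omega)$$
for all $\omega \in \operatorname{int}_X(T)$, where $\lim_{\omega\to x}\frac{(\alpha e_\Phi + e_\Psi)(\omega)}{\|\omega - x\|} = 0$. Applying Proposition 2 again completes the proof.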
Exercise 19 Let $X$ be a normed linear space, and $O$ a nonempty open subset of $X$. Prove: If $\phi, \psi \in \mathbb{R}^O$ are Fréchet differentiable at $x \in O$, then the product operator $\phi\psi$ is Fréchet differentiable at $x$, and
$$D_{\phi\psi,x} = \psi(x)D_{\phi,x} + \phi(x)D_{\psi,x}.$$
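The next result should again be familiar from ordinary calculus.
Proposition 4 (The Chain Rule) Let $X$, $Y$ and $Z$ be normed linear spaces, and $S$ and $T$ subsets of $X$ and $Y$, respectively. Let $\Phi \in T^S$ and $\Psi \in Z^T$ be two maps such that $\Phi$ is Fréchet differentiable at $x \in \operatorname{int}_X(S)$, and $\Psi$ at $\Phi(x) \in \operatorname{int}_Y(T)$. Then, $\Psi \circ \Phi$ is Fréchet differentiable at $x$, and
$$D_{\Psi\circ\Phi,x} = D_{\Psi,\Phi(x)} \circ D_{\Phi,x}.$$
Proof By Proposition 2, there exist maps $e_\Phi \in Y^S$ and $e_\Psi \in Z^T$ such that
$$\Phi(\omega) - \Phi(x) = D_{\Phi,x}(\omega - x) + e_\Phi(\omega) \quad\text{for all } \omega \in \operatorname{int}_X(S) \tag{9}$$
and
$$\Psi(w) - \Psi(\Phi(x)) = D_{\Psi,\Phi(x)}(w - \Phi(x)) + e_\Psi(w) \quad\text{for all } w \in \operatorname{int}_Y(T),$$
with $\lim_{\omega\to x}\frac{e_\Phi(\omega)}{\|\omega - x\|} = 0$ and $\lim_{w\to\Phi(x)}\frac{e_\Psi(w)}{\|w - \Phi(x)\|_Y} = 0$. Setting $w = \Phi(\omega)$ in the second equation, and using (9) together with the linearity of $D_{\Psi,\Phi(x)}$, we get
$$(\Psi\circ\Phi)(\omega) - (\Psi\circ\Phi)(x) = (D_{\Psi,\Phi(x)} \circ D_{\Phi,x})(\omega - x) + e(\omega)$$
for all $\omega \in \operatorname{int}_X(S)$ with $\Phi(\omega) \in \operatorname{int}_Y(T)$, where
$$e(\omega) := D_{\Psi,\Phi(x)}(e_\Phi(\omega)) + e_\Psi(\Phi(\omega)).$$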
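By Proposition 2, therefore, it remains to show that $\lim_{\omega\to x}\frac{e(\omega)}{\|\omega - x\|} = 0$. To this end, observe first that, since $D_{\Psi,\Phi(x)}$ is a continuous linear operator and
$$\lim_{\omega\to x}\frac{\|e_\Phi(\omega)\|_Y}{\|\omega - x\|} = 0,$$
we have $\lim_{\omega\to x}\frac{D_{\Psi,\Phi(x)}(e_\Phi(\omega))}{\|\omega - x\|} = 0$ (Section J.4.3). The proof will thus be complete if we can establish that $\lim_{\omega\to x}\frac{e_\Psi(\Phi(\omega))}{\|\omega - x\|} = 0$. This requires harder work. Note first that
$$\lim_{\omega\to x}\frac{\|\Phi(\omega) - \Phi(x) - D_{\Phi,x}(\omega - x)\|_Y}{\|\omega - x\|} = 0;$$
therefore, there exists an $\varepsilon > 0$ such that $\dfrac{\|\Phi(\omega) - \Phi(x)\|_Y}{\|\omega - x\|}$ …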
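… $D_{\Phi_0,f}(h) = 0$ and $D_{\Phi_i,f}(h) = i(f)^{i-1}h$, $i = 1, \dots, m$.
Now define $\Phi : C[0, 1] \to \mathbb{R}$ by
$$\Phi(f) := \sum_{i=0}^m a_i\int_0^1 f(t)^i\,dt,$$
where $a_0, \dots, a_m \in \mathbb{R}$. Using Propositions 3 and 4 we can readily compute the Fréchet derivative of $\Phi$. Indeed, if $L \in \mathcal{B}(C[0, 1], \mathbb{R})$ is defined by $L(f) := \int_0^1 f(t)\,dt$, then we have
$$\Phi = \sum_{i=0}^m a_i(L \circ \Phi_i).$$
So, since $D_{L,h} = L$ for any $h$ (why?), Propositions 3 and 4 yield
$$D_{\Phi,f}(g) = \sum_{i=0}^m a_iD_{L\circ\Phi_i,f}(g) = \sum_{i=0}^m a_iL\left(D_{\Phi_i,f}(g)\right) = \sum_{i=0}^m a_i\int_0^1 i f(t)^{i-1}g(t)\,dt.$$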
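The formula just obtained can be checked numerically in the same spirit as before. In the sketch below, the coefficients `COEFFS`, the functions `f` and `g`, and the midpoint-rule resolution `N` are all hypothetical choices made for illustration; the code compares $\Phi(f + sg) - \Phi(f)$ with $D_{\Phi,f}(sg)$.

```python
import math

N = 2000                          # midpoint-rule cells (an arbitrary choice)
COEFFS = (1.0, -2.0, 0.5, 3.0)    # hypothetical a_0, ..., a_3

def integrate(func):
    """Midpoint-rule approximation of the integral of func over [0, 1]."""
    return sum(func((i + 0.5) / N) for i in range(N)) / N

def Phi(f):
    """Phi(f) = sum_i a_i * integral of f(t)^i over [0, 1]."""
    return sum(a * integrate(lambda t, i=i: f(t) ** i)
               for i, a in enumerate(COEFFS))

def D_Phi(f, g):
    """sum_i a_i * integral of i f(t)^(i-1) g(t); the i = 0 term vanishes."""
    return sum(a * integrate(lambda t, i=i: i * f(t) ** (i - 1) * g(t))
               for i, a in enumerate(COEFFS) if i >= 1)

# Illustrative assumptions: f(t) = 1 + t, perturbed in the direction g = sin.
f = lambda t: 1.0 + t
g = math.sin

def error_ratio(s):
    """|Phi(f + s g) - Phi(f) - D_Phi(f, s g)| / ||s g||_inf.

    On [0, 1] the sup-norm of s*sin equals |s| * sin(1).
    """
    h = lambda t: f(t) + s * g(t)
    e = Phi(h) - Phi(f) - s * D_Phi(f, g)
    return abs(e) / (abs(s) * math.sin(1.0))

for s in (1e-1, 1e-2, 1e-3):
    print(s, error_ratio(s))
```

The ratio decreases roughly in proportion to $s$, as one expects from a remainder that is of second order in $\|h - f\|_\infty$.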
∗ Exercise 20.H (The Hammerstein Operator) Define the self-map $\Phi$ on $C[0, 1]$ by
$$\Phi(f)(x) := \int_0^1 \theta(x, t)H(t, f(t))\,dt,$$
where $\theta$ is a continuous real map on $[0, 1]^2$ and $H : \mathbb{R}^2 \to \mathbb{R}$ is a continuous function such that $\partial_2H$ and $\partial_2\partial_2H$ are well-defined and continuous functions on $\mathbb{R}^2$. Use the Chain Rule to show that $\Phi$ is Fréchet differentiable, and compute $D_\Phi$.
Exercise 21 Take any natural numbers $n$, $m$ and $k$. Let $O$ and $U$ be nonempty open subsets of $\mathbb{R}^n$ and $\mathbb{R}^m$, respectively. For any given $x \in O$, let $\Phi : O \to U$ and $\Psi : U \to \mathbb{R}^k$ be two maps that are Fréchet differentiable at $x$ and $\Phi(x)$, respectively. Show that $\Psi \circ \Phi$ is Fréchet differentiable at $x$, and
$$D_{\Psi\circ\Phi,x}(y) = J^{\Phi(x)}_\Psi J^x_\Phi\, y \quad\text{for all } y \in \mathbb{R}^n,$$
where $J^x_\Phi$ and $J^{\Phi(x)}_\Psi$ are the Jacobian matrices of $\Phi$ (at $x$) and $\Psi$ (at $\Phi(x)$), respectively.
Exercise 22 Let $X$ be a normed linear space, $O$ a nonempty open and convex subset of $X$, and $\Phi \in \mathbb{R}^O$ a Fréchet differentiable map. Fix any distinct $x, y \in O$, and define $F : (0, 1) \to \mathbb{R}$ by $F(\lambda) := \Phi(\lambda x + (1 - \lambda)y)$. Show that $F$ is differentiable, and
$$F'(\lambda) = D_{\Phi,\lambda x+(1-\lambda)y}(x - y) \quad\text{for all } 0 < \lambda < 1.$$
Exercise 23 Let $X$ and $Y$ be two normed linear spaces, and $\phi$ a Fréchet differentiable real map on $X \times Y$. Fix any $x \in X$, and define $\psi \in \mathbb{R}^Y$ by $\psi(y) := \phi(x - y, y)$. Use the Chain Rule to prove that $\psi$ is differentiable and compute $D_\psi$.
To outline a basic introduction to optimization theory, we also need to go through the notion of the second Fréchet derivative of real functions. Just as in ordinary calculus, the idea is to define this notion as the “derivative of the derivative.” Unfortunately, life gets a bit complicated here. Recall that the Fréchet derivative $D_\phi$ of a real function $\phi$ defined on an open subset $O$ of a normed linear space $X$ is a function that maps $O$ into $X^*$. Therefore, the Fréchet derivative of $D_\phi$ at $x \in O$ is a continuous linear function that maps $X$ into $X^*$. Put differently, the second Fréchet derivative of $\phi$ at $x$ is a member of $\mathcal{B}(X, X^*)$.
Just in case you find this confusing, let us see how this situation compares with differentiating a function of the form $\phi : \mathbb{R}^2 \to \mathbb{R}$ twice. In calculus, by the derivative of $\phi$ at $x \in \mathbb{R}^2$, we understand the gradient of $\phi$, that is, the vector $(\partial_1\phi(x), \partial_2\phi(x)) \in \mathbb{R}^2$. In turn, the second derivative of $\phi$ at $x$ is the matrix $H_x := [\partial_{ij}\phi(x)]_{2\times 2}$, where by $\partial_{ij}\phi(x)$, we understand $\partial_i\partial_j\phi(x)$. (You might recall that this matrix is called the Hessian of $\phi$ at $x$.) Given Example 1.[3], it is only natural that the second Fréchet derivative of $\phi$ at $x$ is the linear operator induced by the matrix $H_x$ (i.e. $y \mapsto H_xy$), and hence it is a linear function that maps $\mathbb{R}^2$ into $\mathbb{R}^2$. Since the dual of $\mathbb{R}^2$ is $\mathbb{R}^2$ (Example J.7), this situation conforms perfectly with the outline of the previous paragraph.
At any rate, things will get clearer below. Let us first state the definition of the second Fréchet derivative of a real function formally.
Definition Let $T$ be a nonempty subset of a normed linear space $X$, and $\phi : T \to \mathbb{R}$ a Fréchet differentiable map. For any given $x \in \operatorname{int}_X(T)$, if $D_\phi : \operatorname{int}_X(T) \to X^*$ is Fréchet differentiable at $x$, then we say that $\phi$ is twice Fréchet differentiable at $x$. In this case, the second Fréchet derivative of $\phi$ at $x$, denoted by $D^2_{\phi,x}$, is a member of $\mathcal{B}(X, X^*)$; we define
$$D^2_{\phi,x} := D_{D_\phi,x}.$$
If $O$ is a nonempty open subset of $T$, and if $\phi$ is twice Fréchet differentiable at every $x \in O$, then we say that $\phi$ is twice Fréchet differentiable on $O$. If $O = \operatorname{int}_X(T)$ here, then $\phi$ is said to be twice Fréchet differentiable.
The thing to get used to here is that $D^2_{\phi,x} \in \mathcal{B}(X, X^*)$, that is, $D^2_{\phi,x}(y) \in X^*$ for each $y \in X$. We should thus write $D^2_{\phi,x}(y)(z)$ for the value of the linear functional $D^2_{\phi,x}(y)$ at $z$. It is, however, customary to write $D^2_{\phi,x}(y, z)$ instead of $D^2_{\phi,x}(y)(z)$, thereby thinking of $D^2_{\phi,x}$ as a function that maps $X \times X$ into $\mathbb{R}$. From this viewpoint, $D^2_{\phi,x}$ is a continuous bilinear functional on $X \times X$ (Exercise J.52).12
The following is an analogue (and an easy consequence) of Proposition 2 for the second Fréchet derivative of a real function.
Proposition 5 Let $X$ be a normed linear space, $T$ a subset of $X$, $x \in \operatorname{int}_X(T)$, and $L \in \mathcal{B}(X, X^*)$. For any $\phi \in \mathbb{R}^T$, $L$ is the second Fréchet derivative of $\phi$ at $x$ if, and only if, there exists a continuous map $e : T \to X^*$ such that
$$D_{\phi,\omega} = D_{\phi,x} + L(\omega - x) + e(\omega) \quad\text{for all } \omega \in \operatorname{int}_X(T), \quad\text{and}\quad \lim_{\omega\to x}\frac{e(\omega)}{\|\omega - x\|} = 0.$$
12 This custom is fully justified, of course. After all, $\mathcal{B}(X, X^*)$ “is” the normed linear space of all continuous bilinear functionals on $X \times X$, that is, these two spaces are linearly isometric. (Recall Exercise J.63.)
Exercise 24 Prove Proposition 5.
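Example 6 [1] Let $I$ be an open interval, $f \in \mathbb{R}^I$ a differentiable map, and $x \in I$. If $f$ is twice differentiable at $x$, then there is an error function $e_1 : I \to \mathbb{R}$ such that $f'(\omega) = f'(x) + f''(x)(\omega - x) + e_1(\omega)$ for all $\omega \in I$, and $\lim_{\omega\to x}\frac{e_1(\omega)}{|\omega - x|} = 0$. (Why?) Hence, by Example 1.[1],
$$D_{f,\omega}(t) - D_{f,x}(t) = (f'(\omega) - f'(x))t = f''(x)(\omega - x)t + e_1(\omega)t$$
for all $\omega \in I$ and $t \in \mathbb{R}$. We define $L : \mathbb{R} \to \mathbb{R}^*$ and $e : I \to \mathbb{R}^*$ by $L(u)(v) := f''(x)uv$ and $e(u)(v) := e_1(u)v$, respectively. Then
$$D_{f,\omega}(t) - D_{f,x}(t) = L(\omega - x)(t) + e(\omega)(t) \quad\text{for all } \omega \in I \text{ and } t \in \mathbb{R},$$
and it follows from Proposition 5 that $D^2_{f,x} = L$, that is,
$$D^2_{f,x}(u, v) = f''(x)uv \quad\text{for all } u, v \in \mathbb{R}.$$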
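Reminder. Given any n ∈ N, let S be a subset of $\mathbb{R}^n$ with nonempty interior, and $\varphi \in \mathbb{R}^S$ a map such that the jth partial derivative of ϕ (as a real map on $\mathrm{int}_{\mathbb{R}^n}(S)$) exists. For any $x \in \mathrm{int}_{\mathbb{R}^n}(S)$ and i, j = 1, ..., n, the number $\partial_{ij}\varphi(x) := \partial_i \partial_j \varphi(x)$ is referred to as a second-order partial derivative of ϕ at x. If $\partial_{ij}\varphi(x)$ exists for each $x \in \mathrm{int}_{\mathbb{R}^n}(S)$, then we refer to the map $x \mapsto \partial_{ij}\varphi(x)$ on $\mathrm{int}_{\mathbb{R}^n}(S)$ as a second-order partial derivative of ϕ. (Note. A “folk” theorem of advanced calculus says that if $\partial_{ij}\varphi$ and $\partial_{ji}\varphi$ are continuous maps (on $\mathrm{int}_{\mathbb{R}^n}(S)$), then $\partial_{ij}\varphi = \partial_{ji}\varphi$.)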
[2] Given any n ∈ N, let O be an open subset of $\mathbb{R}^n$, and take any Fréchet differentiable map $\varphi \in \mathbb{R}^O$. If ϕ is twice Fréchet differentiable at x ∈ O, then all second-order partial derivatives of ϕ at x exist, and we have
$$D^2_{\varphi,x}(u, v) = \sum_{i=1}^{n} \sum_{j=1}^{n} \partial_{ij}\varphi(x)\, u_i v_j \ \text{ for all } u, v \in \mathbb{R}^n.$$
Thus, the second Fréchet derivative of ϕ at x is none other than the symmetric bilinear functional induced by the so-called Hessian matrix $[\partial_{ij}\varphi(x)]_{n \times n}$ of ϕ at x.
Exercise 25 Prove the assertion made in Example 6.[2].
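For readers who like to see numbers, here is a small sanity check of the formula in Example 6.[2]. It is only a sketch: the test function phi, the point x and the directions u, v are hypothetical choices made for illustration, and the mixed central difference is merely one convenient way to approximate the bilinear form.

```python
import math

# Hypothetical test function on R^2 (an assumption, not from the text).
def phi(x):
    return x[0] ** 2 * x[1] + math.sin(x[1])

def hessian(x):
    # Second-order partial derivatives of phi at x, computed by hand.
    return [[2.0 * x[1], 2.0 * x[0]],
            [2.0 * x[0], -math.sin(x[1])]]

def bilinear(H, u, v):
    # The bilinear functional (u, v) -> sum_i sum_j H[i][j] * u[i] * v[j].
    n = len(u)
    return sum(H[i][j] * u[i] * v[j] for i in range(n) for j in range(n))

def second_difference(f, x, u, v, h=1e-4):
    # Mixed central difference; it approximates the second derivative of f
    # at x evaluated at the pair of directions (u, v).
    def at(s, t):
        return f([x[k] + s * u[k] + t * v[k] for k in range(len(x))])
    return (at(h, h) - at(h, -h) - at(-h, h) + at(-h, -h)) / (4.0 * h * h)

x = [1.0, 0.5]
u = [1.0, 2.0]
v = [-1.0, 3.0]
H = hessian(x)
exact = bilinear(H, u, v)
approx = second_difference(phi, x, u, v)
print(exact, approx)
```

The two printed numbers should agree to several decimal places, and swapping u and v leaves the exact value unchanged, which is the symmetry of the Hessian at work.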
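Example 7. Let O be a nonempty open and convex subset of a normed linear space X, and let x and y be two distinct points in O. Take any twice Fréchet differentiable map $\varphi \in \mathbb{R}^O$, and define $F \in \mathbb{R}^{(0,1)}$ by
$$F(\lambda) := \varphi(\lambda x + (1 - \lambda)y).$$
We wish to show that F is twice differentiable and compute $F''$. (Any guesses?)
By Exercise 22, F is differentiable, and we have
$$F'(\alpha) = D_{\varphi,\alpha x + (1-\alpha)y}(x - y), \quad 0 < \alpha < 1. \tag{12}$$
Define $G := F'$, fix any 0 < λ < 1, and let us agree to write $\omega_\alpha$ for αx + (1 − α)y for any 0 < α < 1. By Proposition 5, there exists a continuous map $e : O \to X^*$ such that
$$D_{\varphi,\omega} = D_{\varphi,\omega_\lambda} + D^2_{\varphi,\omega_\lambda}(\omega - \omega_\lambda) + e(\omega) \ \text{ for all } \omega \in O,$$
and $\lim_{\omega \to \omega_\lambda} \frac{e(\omega)}{\|\omega - \omega_\lambda\|} = 0$. Thus, for any α ∈ (0, 1)\{λ}, (12) gives
$$F'(\alpha) - F'(\lambda) = D_{\varphi,\omega_\alpha}(x - y) - D_{\varphi,\omega_\lambda}(x - y) = D^2_{\varphi,\omega_\lambda}(\omega_\alpha - \omega_\lambda, x - y) + e(\omega_\alpha)(x - y).$$
Since $\omega_\alpha - \omega_\lambda = (\alpha - \lambda)(x - y)$ and $D^2_{\varphi,\omega_\lambda}$ is a bilinear functional (on X × X), we may divide both sides of this equation by α − λ to get
$$\frac{F'(\alpha) - F'(\lambda)}{\alpha - \lambda} = D^2_{\varphi,\omega_\lambda}(x - y, x - y) + \frac{e(\omega_\alpha)(x - y)}{\alpha - \lambda}$$
for any α ∈ (0, 1)\{λ}. But since $\alpha \mapsto \omega_\alpha$ is a continuous map from (0, 1) into X,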
$f(t)^i\,dt$, where $a_0, \ldots, a_m \in \mathbb{R}$. Compute $D^2_{\varphi,f}$ for any f ∈ C[0, 1].
We have now at hand a potent theory of differentiation which generalizes the classical theory. There still remains one major difficulty, however. Insofar as our basic definition is concerned, we are unable to differentiate a map that is defined on a subset of a normed linear space with no interior. For instance, let S = {(a, 1) : 0 < a < 1}, and define ϕ : S → R by $\varphi(a, 1) := a^2$. What is the Fréchet derivative of ϕ? Well, since S has no interior in $\mathbb{R}^2$, our basic definition does not even allow us to pose this question. That definition is based on the idea of perturbing (infinitesimally) a given point in the interior of the domain of a map in any direction in the space, and analyzing the behavior of the resulting difference-quotient. In this example, because $\mathrm{int}_{\mathbb{R}^2}(S) = \emptyset$, we cannot proceed in this manner. Indeed, the variations we consider must be horizontal in this case. Put differently, if x ∈ S and we wish to study the difference-quotient $\frac{\varphi(\omega) - \varphi(x)}{\|\omega - x\|}$, the variation ω − x must belong to the linear subspace R × {0}. So, in this example, the Fréchet derivative of ϕ needs to be viewed as a linear functional from R × {0} into R (and not from $\mathbb{R}^2$ into R).
Apparently, there is an obvious way we can generalize the definition of the Fréchet derivative, and capture these sorts of examples. All we need is to take the domain of the function to be differentiated as open in the affine manifold it generates. Let us first give such sets a name.
Definition. Let X be a normed linear space. A subset S of X is said to be relatively open if |S| > 1 and S is open in aff(S).13
Now consider a real map ϕ whose domain S is relatively open in some normed linear space. Obviously, the difference-quotient $\frac{\varphi(\omega) - \varphi(x)}{\|\omega - x\|}$ makes sense iff ω ∈ S\{x}. Therefore, span(S − x) is the linear space that contains all possible variations about x, or equivalently, aff(S) is the set of all possible directions of perturbing x. (Recall the example considered above.) We are, therefore, led to define the Fréchet derivative of ϕ at x as a bounded linear functional on span(S − x).
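Notation. Let T be a subset of a normed linear space. In what follows we denote the interior of T in aff(T) as $T^{\diamond}$, that is,
$$T^{\diamond} := \mathrm{int}_{\mathrm{aff}(T)}(T).$$
Thus, T is relatively open iff $T = T^{\diamond}$. Moreover, $\mathrm{span}(T^{\diamond} - z) = \mathrm{span}(T - z)$ for any $z \in T^{\diamond}$,14 and if T is convex and $T^{\diamond} \neq \emptyset$, then $T^{\diamond} = \mathrm{ri}(T)$ (Proposition I.11).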
Definition. Let X and Y be two normed linear spaces, and T a subset of X with |T| > 1 and $T^{\diamond} \neq \emptyset$. Let
$$s(T) := \mathrm{span}(T - z), \tag{13}$$
where z ∈ T is arbitrary.16 For any $x \in T^{\diamond}$, a map Φ : T → Y is said to be Fréchet differentiable at x if there is a continuous linear operator $D_{\Phi,x} \in B(s(T), Y)$ such that
$$\lim_{\omega \to x} \frac{\Phi(\omega) - \Phi(x) - D_{\Phi,x}(\omega - x)}{\|\omega - x\|} = 0.$$
The linear operator $D_{\Phi,x}$ is called the Fréchet derivative of Φ at x.
If T = {x}, then T is itself an affine manifold, and hence it equals its interior in itself. But there is no sequence in T\{x} that converges to x, so we cannot possibly define the Fréchet derivative of a map on T. This is the reason why I assume |T| > 1 here.
If Φ is Fréchet differentiable at every $x \in T^{\diamond}$, then we say that Φ is Fréchet differentiable. In this case, we say that Φ is continuously Fréchet differentiable if the map $D_\Phi : T^{\diamond} \to B(s(T), Y)$, defined by $D_\Phi(x) := D_{\Phi,x}$, is continuous.
Definition. Let T be a subset of a normed linear space X with |T| > 1 and $T^{\diamond} \neq \emptyset$, and take a Fréchet differentiable map ϕ : T → R. For any given $x \in T^{\diamond}$, if $D_\varphi : T^{\diamond} \to s(T)^*$ is Fréchet differentiable at x (where s(T) is defined by (13)), we say that ϕ is twice Fréchet differentiable at x. In this case, the second Fréchet derivative of ϕ at x, denoted by $D^2_{\varphi,x}$, is a member of $B(s(T), s(T)^*)$; we define $D^2_{\varphi,x} := D_{D_\varphi,x}$. If ϕ is twice Fréchet differentiable at every $x \in T^{\diamond}$, then we say that ϕ is twice Fréchet differentiable.
These definitions extend the ones given in Sections 1.3 and 1.6. After all, if T is a subset of X with $\mathrm{int}_X(T) \neq \emptyset$, then aff(T) = X = s(T) (yes?), so the two definitions become identical. Moreover, most of our findings in the previous sections apply to the Fréchet derivatives of maps defined on any nonsingleton T ⊆ X with $T^{\diamond} \neq \emptyset$. All we have to do is to apply those results on s(T) as opposed to X.17 For instance, Proposition 2 becomes the following in this setup.
Proposition 2∗. Let X and Y be two normed linear spaces, T a subset of X, $x \in T^{\diamond}$, and $L \in B(s(T), Y)$, where s(T) is defined by (13). For any $\Phi \in Y^T$, L is the Fréchet derivative of Φ at x if, and only if, there exists a continuous map $e \in Y^T$ such that
$$\Phi(\omega) = \Phi(x) + L(\omega - x) + e(\omega) \ \text{ for all } \omega \in T^{\diamond}, \quad\text{and}\quad \lim_{\omega \to x} \frac{e(\omega)}{\|\omega - x\|} = 0.$$
Let us now go back to our silly little example in which the question was to find the Fréchet derivative of ϕ : S → R where S = {(a, 1) : 0 < a < 1} and $\varphi(a, 1) := a^2$. As S is relatively open, this question now makes sense. Besides, it is very easy to answer. For any 0 < a < 1, $D_{\varphi,(a,1)}$ is a linear functional on R × {0}, and just as one would like to see, we have $D_{\varphi,(a,1)}(t, 0) = 2at$ for any t ∈ R. Moreover, $D^2_{\varphi,(a,1)}$ is a linear operator from R × {0} into the dual of R × {0} (which is R × {0} itself), or equivalently, a bilinear functional on (R × {0}) × (R × {0}). We have:
$$D^2_{\varphi,(a,1)}((u, 0), (v, 0)) = 2uv \ \text{ for all } u, v \in \mathbb{R},$$
as you can easily check.18
16 For any given S ⊆ X, we have span(S − y) = span(S − z) for any y, z ∈ S. Right?
17 There is one exception, however. In the statement of the Chain Rule (Proposition 4), if we posit that O and U are relatively open, then we need the additional hypothesis that $D_{\Phi,x}(s(O)) \subseteq D_{\Psi,\Phi(x)}(s(U))$. With this modification, the proof goes through verbatim.
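If you would like to see the silly little example at work, the following sketch may help. It evaluates the error term of the candidate derivative (t, 0) → 2at along horizontal variations only, and it refuses to evaluate ϕ off the segment S. The function names and the sample point a = 0.3 are illustrative assumptions.

```python
def phi(point):
    # phi is defined only on S = {(a, 1) : 0 < a < 1}.
    a, second = point
    if second != 1.0 or not (0.0 < a < 1.0):
        raise ValueError("point lies outside S")
    return a * a

def derivative(a):
    # Candidate Frechet derivative at (a, 1): a linear functional on R x {0}.
    return lambda variation: 2.0 * a * variation[0]

def remainder_ratio(a, t):
    # |phi(w) - phi(x) - D(w - x)| / ||w - x|| for x = (a, 1), w = (a + t, 1).
    x = (a, 1.0)
    w = (a + t, 1.0)
    variation = (t, 0.0)  # every admissible variation is horizontal
    error = phi(w) - phi(x) - derivative(a)(variation)
    return abs(error) / abs(t)

for t in (1e-1, 1e-2, 1e-3, 1e-4):
    print(t, remainder_ratio(0.3, t))
```

The printed ratio shrinks in step with |t|, as the definition demands, while a call such as phi((0.3, 1.001)) raises an error because vertical perturbations leave S.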
Remark 1. Not only do the definitions we have given in this section reduce to those of the earlier sections for maps defined on sets with nonempty interior, they are also consistent with them in the following sense. Let X and Y be two normed linear spaces, and S ⊆ X. Suppose that Φ : S → Y is Fréchet differentiable at $x \in \mathrm{int}_X(S)$. Then, for any T ⊆ S with |T| > 1 and $x \in T^{\diamond}$, the map $\Phi|_T$ is Fréchet differentiable at x, and we have
$$D_{\Phi|_T,x} = D_{\Phi,x}|_{s(T)}.$$
One way of stating the classical Mean Value Theorem is the following: Given a differentiable real map f on an open interval I, for any a, b ∈ I with a < b, there exists a c ∈ (a, b) such that $f(b) - f(a) = f'(c)(b - a)$. This fact extends readily to real maps defined on a normed linear space.
The Generalized Mean Value Theorem. Let S be a relatively open and convex subset of a normed linear space X, and $\varphi \in \mathbb{R}^S$ a Fréchet differentiable map. Then, for any distinct x, y ∈ S,
$$\varphi(x) - \varphi(y) = D_{\varphi,z}(x - y)$$
for some z ∈ co{x, y}\{x, y}.
Notice that the entire action takes place on the line segment co{x, y} here. Intuitively speaking, then, we should be able to prove this result simply by applying the good ol’ Mean Value Theorem on this line segment. The following elementary observation, which generalizes the result found in Exercise 22, is a means to this end.
18 Quiz. Compute $D_{\varphi,(\frac{1}{2},1)}$ and $D^2_{\varphi,(\frac{1}{2},1)}$, assuming this time that $S = \{(a, \tfrac{3}{2} - a) : 0 < a < \tfrac{3}{2}\}$. (Hint. The domain of $D_{\varphi,(\frac{1}{2},1)}$ is {(t, −t) : t ∈ R}.)
Lemma 1. Let S be a relatively open and convex subset of a normed linear space X, x and y distinct points in S, and $\varphi \in \mathbb{R}^S$ a Fréchet differentiable map. If $F \in \mathbb{R}^{(0,1)}$ is defined by F(λ) := ϕ(λx + (1 − λ)y), then F is differentiable, and
$$F'(\lambda) = D_{\varphi,\lambda x + (1-\lambda)y}(x - y), \quad 0 < \lambda < 1.$$
Proof of the Generalized Mean Value Theorem. Fix any distinct x, y ∈ S, and define F : [0, 1] → R by F(λ) := ϕ(λx + (1 − λ)y). Being the composition of two continuous functions, F is continuous. (Yes?) Moreover, by Lemma 1, $F|_{(0,1)}$ is differentiable, and $F'(\lambda) = D_{\varphi,\lambda x + (1-\lambda)y}(x - y)$ for any 0 < λ < 1. So, applying the
Exercise 27 Let X be a preordered normed linear space. Let O be a nonempty open and convex subset of X, and $\varphi \in \mathbb{R}^O$ a Fréchet differentiable map. Show that if $D_{\varphi,x}$ is a positive linear functional for any x ∈ O, then ϕ is increasing (that is, ϕ(x) ≥ ϕ(y) for any x, y ∈ O with $x - y \in X_+$).
Exercise 28 For any n ∈ N, let S be a nonempty compact and convex subset of $\mathbb{R}^n$ such that $\mathrm{int}_{\mathbb{R}^n}(S) \neq \emptyset$. Prove: If ϕ ∈ C(S) is a Fréchet differentiable function such that ϕ(x) = 0 for all $x \in \mathrm{bd}_{\mathbb{R}^n}(S)$, then there is an $x^* \in \mathrm{int}_{\mathbb{R}^n}(S)$ such that $D_{\varphi,x^*}$ is the zero functional.
Warning. The Mean Value Theorem (indeed, Rolle’s Theorem) cannot be extended to the context of vector calculus without substantial modification. For instance, in the case of the map $\Phi : \mathbb{R} \to \mathbb{R}^2$ defined by Φ(t) := (sin t, cos t), we have Φ(0) = Φ(2π) but $D_{\Phi,x}(t) \neq (0, 0)$ for any 0 ≤ x ≤ 2π and any nonzero t ∈ R. (Check!)
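The “Check!” above can be carried out by hand, but a quick numerical look does no harm. The sketch below uses the fact that here the derivative at x is the linear map t → (t cos x, −t sin x); the grid of 1001 points on [0, 2π] is an arbitrary choice.

```python
import math

def Phi(t):
    return (math.sin(t), math.cos(t))

def dPhi(x, t):
    # The Frechet derivative of Phi at x, evaluated at the variation t.
    return (t * math.cos(x), -t * math.sin(x))

# Evaluate the length of the derivative at the unit variation t = 1
# on a grid of points in [0, 2*pi].
grid = [2.0 * math.pi * k / 1000 for k in range(1001)]
min_speed = min(math.hypot(*dPhi(x, 1.0)) for x in grid)

print(Phi(0.0), Phi(2.0 * math.pi))  # equal up to rounding
print(min_speed)                     # stays near 1, so it never vanishes
```

So the endpoints coincide, and yet there is no point between them at which the derivative is the zero operator, which is exactly why Rolle's Theorem has no verbatim analogue for vector-valued maps.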
The following generalization of the Second Mean Value Theorem is also worth noting. It will prove very handy in Section 3 when we look into the properties of differentiable concave functionals.
The Generalized Second Mean Value Theorem. Let S be a relatively open and convex subset of a normed linear space X, and $\varphi \in \mathbb{R}^S$ a continuously Fréchet differentiable map. If ϕ is twice Fréchet differentiable,19 then, for any distinct x, y ∈ S,
$$\varphi(x) - \varphi(y) = D_{\varphi,y}(x - y) + \tfrac{1}{2} D^2_{\varphi,z}(x - y, x - y)$$
for some z ∈ co{x, y}\{x, y}.
Exercise 29.H Prove the Generalized Second Mean Value Theorem.
Exercise 30 (A Taylor’s Formula with Remainder) Let O be a nonempty open and convex subset of a normed linear space X, and $\varphi \in \mathbb{R}^O$ a continuously Fréchet differentiable map which is also twice Fréchet differentiable. Show that, for each x ∈ O, there is a (remainder) function $r \in \mathbb{R}^X$ such that
$$\varphi(x + z) - \varphi(x) = D_{\varphi,x}(z) + \tfrac{1}{2} D^2_{\varphi,x}(z, z) + r(z) \ \text{ for any } z \in X \text{ with } x + z \in O,$$
and $\lim_{z \to 0} \frac{r(z)}{\|z\|^2} = 0$.
Example 8. Let O be a nonempty open subset of $\mathbb{R}^2$, and take any twice continuously differentiable map ϕ : O → R. One can show that ϕ is not only continuously Fréchet differentiable, but it is also twice Fréchet differentiable.20 Moreover, as we noted earlier, a “folk” theorem of multivariate calculus says that $\partial_{12}\varphi(\omega) = \partial_{21}\varphi(\omega)$ for any ω ∈ O.21 Consequently, Example 6.[2] yields
$$D^2_{\varphi,z}(u, u) = \partial_{11}\varphi(z)u_1^2 + 2\partial_{12}\varphi(z)u_1 u_2 + \partial_{22}\varphi(z)u_2^2$$
for any z ∈ O and $u \in \mathbb{R}^2$. Combining this with the Generalized Second Mean Value Theorem and Example 1.[3], therefore, we obtain the following result of advanced calculus: For every distinct x, y ∈ O, there exists a z ∈ co{x, y}\{x, y} such that
$$\varphi(x) - \varphi(y) = \partial_1\varphi(y)(x_1 - y_1) + \partial_2\varphi(y)(x_2 - y_2) + E(z)$$
where
$$E(z) := \tfrac{1}{2}\left(\partial_{11}\varphi(z)(x_1 - y_1)^2 + 2\partial_{12}\varphi(z)(x_1 - y_1)(x_2 - y_2) + \partial_{22}\varphi(z)(x_2 - y_2)^2\right).$$
19 It may seem like there is a redundancy in the hypotheses here, but in fact this is not the case. Twice Fréchet differentiability of ϕ does not, in general, imply its continuous Fréchet differentiability.
20 While this may look like quite a bit to swallow, the proof is hidden in Exercises 9 and 10.
21 You have surely seen this fact before. (A Hessian matrix is always symmetric, no?) Its proof, while a bit tedious, follows basically from the definitions (but note that the assumption of $\partial_{12}$ (or $\partial_{21}$) being continuous is essential).
2.2 The Mean Value Inequality
We noted above that the Generalized Mean Value Theorem need not apply to maps that are not real-valued. It turns out that this is not a major difficulty. Indeed, most of the results of one-variable calculus that can be deduced from the Mean Value Theorem can also be obtained by using the so-called Mean Value Inequality: If O is an open subset of R that contains the open interval (a, b), and $f \in \mathbb{R}^O$ is differentiable, then
$$|f(b) - f(a)| \le \sup\{|f'(t)| : t \in O\}\,(b - a).$$
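Here is a minimal numerical illustration of the one-variable inequality, under assumptions of my own choosing: f(t) := t³ − t, O := (a, b) = (−1, 2), and a grid search standing in for the supremum. The grid value can only underestimate the true supremum, so the comparison below is, if anything, harder to pass than the inequality itself.

```python
def f(t):
    return t ** 3 - t

def df(t):
    # The derivative of f, computed by hand.
    return 3.0 * t ** 2 - 1.0

def sup_abs_derivative(a, b, n=10_000):
    # Grid estimate (from below) of sup{|f'(t)| : a < t < b}.
    return max(abs(df(a + (b - a) * k / n)) for k in range(1, n))

a, b = -1.0, 2.0
lhs = abs(f(b) - f(a))
rhs = sup_abs_derivative(a, b) * (b - a)
print(lhs, rhs)
```

The left side equals 6 while the right side is close to 33, so the bound holds with plenty of room; the inequality is a crude estimate, not an identity.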
It turns out that this result extends nicely to the present framework, and this extension is all one needs for most purposes.22
The Mean Value Inequality. Let X and Y be two normed linear spaces, and O a nonempty open subset of X. Let $\Phi \in Y^O$ be Fréchet differentiable. Then, for every x, y ∈ O with co{x, y} ⊆ O, there exists a real number K ≥ 0 such that
22 For simplicity we state this result for maps defined on open sets, but it is straightforward to extend it to the case of maps defined on relatively open sets.
Lemma 2. If x, y and z are any points in a normed linear space with z ∈ co{x, y}, then
$$\|x - y\| = \|x - z\| + \|z - y\|.$$
Proof. By hypothesis, there exists a 0 ≤ λ ≤ 1 such that z = λx + (1 − λ)y. Then x − z = (1 − λ)(x − y) and z − y = λ(x − y). Thus
$$\|x - z\| + \|z - y\| = (1 - \lambda)\|x - y\| + \lambda\|x - y\| = \|x - y\|,$$
as we sought.
Proof of the Mean Value Inequality. Fix any x, y ∈ O with co{x, y} ⊆ O, and take any K ≥ 0 that satisfies (15).23 Towards deriving a contradiction, suppose that there exists an ε > 0 such that
$$\|\Phi(x) - \Phi(y)\|_Y > (K + \varepsilon)\|x - y\|.$$
Let $x_0 := x$ and $y_0 := y$. Since, by subadditivity of $\|\cdot\|_Y$,
− y1. Proceeding this way inductively, we obtain two sequences $(x_m)$ and $(y_m)$ in co{x, y} and a vector z ∈ co{x, y} with the following properties: For all m ∈ N,
(i) $z \in \mathrm{co}\{x_m, y_m\}$,
(ii) $\lim x_m = z = \lim y_m$,
(iii) $\|\Phi(x_m) - \Phi(y_m)\|_Y > (K + \varepsilon)\|x_m - y_m\|$.
(Verify this carefully.24)
Since Φ is Fréchet differentiable at z, there is a δ > 0 such that $N_{\delta,X}(z) \subseteq O$ and
23 Quiz. How am I so sure that such a real number K exists?
24 I am implicitly invoking Lemma 2 here along with Cantor’s Nested Interval Lemma. Please make sure I’m not overlooking anything.
Recall that a major corollary of the Mean Value Theorem is the fact that a real function whose derivative vanishes on an open interval must be constant on that interval. The Mean Value Inequality yields the following generalization of this fact.
Corollary 1. Let X and Y be two normed linear spaces, and O a nonempty open and convex subset of X. If $\Phi \in Y^O$ is Fréchet differentiable and $D_{\Phi,x}$ is the zero operator for each x ∈ O, then there is a y ∈ Y such that Φ(x) = y for all x ∈ O.
Exercise 31 Prove Corollary 1.
Exercise 32.H Show that the term “convex” can be replaced with “connected” in Corollary 1.
Exercise 33.H Let X and Y be two normed linear spaces, O a nonempty open and convex subset of X, and x, y ∈ O. Show that if $\Phi \in Y^O$ is Fréchet differentiable and $x_o \in O$, then
$$\|\Phi(x) - \Phi(y) - D_{\Phi,x_o}(x - y)\|_Y \le K\|x - y\|$$
for any $K \ge \sup\{\|D_{\Phi,w} - D_{\Phi,x_o}\|^* : w \in \mathrm{co}\{x, y\}\}$.
You might recall that a concave function f defined on an open interval I possesses very nice differentiability properties. In particular, any such f is differentiable everywhere on I but countably many points. While our main goal is to study those concave functions that are defined on convex subsets of an arbitrary normed linear space, it may still be a good idea to warm up by sketching a quick proof of this elementary fact.
Example 9. Let I be an open interval and $f \in \mathbb{R}^I$ a concave map. It is easy to check that the right-derivative $f'_+$ of f is a well-defined and decreasing function on I. (Prove!) Thus if $(x_m) \in I^\infty$ is a decreasing sequence with $x_m \downarrow x$, we have $\lim f'_+(x_m) \le f'_+(x)$. On the other hand, concavity implies that
$$f'_+(x_m) \ge \frac{f(y) - f(x_m)}{y - x_m} \ \text{ for all } y \in I \text{ with } y > x_m,$$
for each m, so, since f is continuous (Corollary A.2), we get
$$\lim_{m \to \infty} f'_+(x_m) \ge \frac{f(y) - f(x)}{y - x} \ \text{ for all } y \in I \text{ with } y > x.$$
In turn, this implies $\lim f'_+(x_m) \ge f'_+(x)$, so we find $\lim f'_+(x_m) = f'_+(x)$. But the analogous reasoning would yield this equation if $(x_m)$ was an increasing sequence with $x_m \uparrow x$. (Yes?) Conclusion: f is differentiable at x iff $f'_+$ is continuous at x. But since $f'_+$ is a monotonic function, it can have at most countably many points of discontinuity (Exercise B.8). Therefore, f is differentiable everywhere on I but countably many points.
Exercise 34 Let I be an open interval and $f \in \mathbb{R}^I$ a concave map. Show that there exists a countable subset S of I such that $f' \in C(I \setminus S)$.
Unfortunately, we are confronted with various difficulties in higher dimensions. For instance, the map $\varphi : \mathbb{R}^2 \to \mathbb{R}$ defined by ϕ(u, v) := −|u| is a concave function which is not differentiable on the uncountable set {0} × R. Worse still, a concave function on an infinite-dimensional normed linear space may not possess a Fréchet derivative anywhere! For example, $\varphi : \ell^1 \to \mathbb{R}$ defined by $\varphi((x_m)) := -\|(x_m)\|_1$ is not Fréchet differentiable at any point in its domain (Exercise 17). Nevertheless, the concave functions that arise in most economic applications are in fact continuously differentiable, and hence for most practical purposes, these observations are not too problematic.
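To see the kink in the map (u, v) → −|u| concretely, one may compare the difference-quotients taken from the right and from the left in the first coordinate at a point whose first coordinate is zero. The sketch below does just that; the sample value v = 3 and the step size are arbitrary assumptions.

```python
def phi(u, v):
    # The concave map (u, v) -> -|u| on R^2.
    return -abs(u)

def quotient(v, t):
    # Difference-quotient of phi at the point (0, v) along the variation (t, 0).
    return (phi(t, v) - phi(0.0, v)) / t

right = quotient(3.0, 1e-6)   # variation to the right
left = quotient(3.0, -1e-6)   # variation to the left
print(right, left)
```

The two quotients are −1 and +1, no matter how small the step is, so no single linear functional can serve as a derivative at such a point.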
∗Remark 2. Just in case the comments above sounded to you overly dramatic, we mention here, without proof, two positive results about the Fréchet differentiability of concave maps.
(a) Take any n ∈ N, and let us agree to say that a set S in $\mathbb{R}^n$ is null if, for all ε > 0, there exist countably many n-cubes such that (i) S is contained in the union of these cubes, and (ii) the sum of the side lengths of these cubes is at most ε. (Recall Section D.1.4.) One can show that if O is an open and convex subset of $\mathbb{R}^n$ and ϕ is a concave map on O, then the set of all points x ∈ O at which ϕ fails to be Fréchet differentiable is null.
(b) Asplund’s Theorem. If X is a Banach space such that $X^*$ is separable, and ϕ is a continuous and concave real function defined on an open and convex subset O of X, then the set of all points at which ϕ is Fréchet differentiable is dense in O.25
Exercise 35.H Let O be a nonempty open and convex subset of a normed linear space X, and $\varphi \in \mathbb{R}^O$. We say that an affine map ϑ : X → R is a support of ϕ at x ∈ O if $\vartheta|_O \ge \varphi$ and ϑ(x) = ϕ(x). Show that if ϕ is a concave map which is Fréchet differentiable at x ∈ O, then ϕ has a unique support at x.26
25 Asplund’s Theorem is in fact more general than this. It says that the set of all points at which ϕ is Fréchet differentiable is a countable intersection of open and dense subsets of O. The easiest proof of this result that I know is the one given by Preiss and Zajicek (1984), who in fact prove something even more general.
26 Warning. The converse of this statement is false in general, but it is true when X is a Euclidean space.
3.2 Fréchet Differentiable Concave Maps
Recall that there are various useful ways of characterizing the concavity of differentiable real maps defined on an open interval. It is more than likely that you are familiar with the fact that if I is an open interval and $\varphi \in \mathbb{R}^I$ is differentiable and concave, then the difference-quotient $\frac{\varphi(y) - \varphi(x)}{y - x}$ is at most $\varphi'(x)$ if y > x, while it is at least $\varphi'(x)$ if x > y. (Draw a picture.) Put more concisely,
$$\varphi(y) - \varphi(x) \le \varphi'(x)(y - x) \ \text{ for all } x, y \in I.$$
A useful consequence of this observation is that a differentiable $\varphi \in \mathbb{R}^I$ is concave iff its derivative is a decreasing function on I. We now extend these facts to the context of real maps defined on open and convex subsets of an arbitrary normed linear space.
Proposition 6. Let S be a relatively open and convex subset of a normed linear space, and $\varphi \in \mathbb{R}^S$ a Fréchet differentiable map. Then, ϕ is concave if, and only if,
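$$\varphi(y) - \varphi(x) \le D_{\varphi,x}(y - x) \ \text{ for all } x, y \in S. \tag{16}$$
Proof. Suppose that ϕ is concave, and take any x, y ∈ S. Then,
$$\varphi(x + \lambda(y - x)) = \varphi((1 - \lambda)x + \lambda y) \ge (1 - \lambda)\varphi(x) + \lambda\varphi(y)$$
for all 0 ≤ λ ≤ 1. Thus
$$\frac{1}{\lambda}\left(\varphi(x + \lambda(y - x)) - \varphi(x) - D_{\varphi,x}(\lambda(y - x))\right) \ge \varphi(y) - \varphi(x) - D_{\varphi,x}(y - x)$$
for each 0 < λ < 1. We obtain (16) by letting λ → 0 in this expression, and using the definition of $D_{\varphi,x}$.27
Conversely, suppose (16) is true. Take any x, y ∈ S and 0 < λ < 1. If we set z := λx + (1 − λ)y, then λ(x − z) + (1 − λ)(y − z) = 0, so (16) implies
$$\varphi(z) = \lambda\varphi(z) + (1 - \lambda)\varphi(z) + D_{\varphi,z}(\lambda(x - z) + (1 - \lambda)(y - z))$$
$$= \lambda(\varphi(z) + D_{\varphi,z}(x - z)) + (1 - \lambda)(\varphi(z) + D_{\varphi,z}(y - z))$$
$$\ge \lambda\varphi(x) + (1 - \lambda)\varphi(y),$$
as we sought.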
Corollary 2. Let S be a relatively open and convex subset of a normed linear space X, and $\varphi \in \mathbb{R}^S$ a Fréchet differentiable map. Then, ϕ is concave if, and only if,
$$(D_{\varphi,y} - D_{\varphi,x})(y - x) \le 0 \ \text{ for all } x, y \in S. \tag{17}$$
27 Notice that I used here the Fréchet differentiability of ϕ only at x. Thus: $\varphi(y) - \varphi(x) \le D_{\varphi,x}(y - x)$ holds for any y ∈ S and concave $\varphi \in \mathbb{R}^S$, provided that $D_{\varphi,x}$ exists.
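The inequalities (16) and (17) are easy to test numerically on a concrete concave map. The sketch below does so for the hypothetical choice ϕ(x) := −(x₁² + x₁x₂ + x₂²) on R², with the derivative written out by hand and a fixed random seed; it is an illustration under these assumptions, not a proof of anything.

```python
import random

def phi(x):
    # A concave quadratic on R^2 (its Hessian is negative definite).
    return -(x[0] ** 2 + x[0] * x[1] + x[1] ** 2)

def grad(x):
    return (-(2.0 * x[0] + x[1]), -(x[0] + 2.0 * x[1]))

def D(x, h):
    # The Frechet derivative of phi at x, evaluated at the variation h.
    g = grad(x)
    return g[0] * h[0] + g[1] * h[1]

rng = random.Random(0)
gaps16, gaps17 = [], []
for _ in range(1000):
    x = (rng.uniform(-2.0, 2.0), rng.uniform(-2.0, 2.0))
    y = (rng.uniform(-2.0, 2.0), rng.uniform(-2.0, 2.0))
    d = (y[0] - x[0], y[1] - x[1])
    gaps16.append(phi(y) - phi(x) - D(x, d))   # (16) says this is <= 0
    gaps17.append(D(y, d) - D(x, d))           # (17) says this is <= 0

print(max(gaps16), max(gaps17))
```

Both maxima come out nonpositive (up to rounding), as Proposition 6 and Corollary 2 predict for a concave map.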
Exercise 36 Prove Corollary 2.
Exercise 37 Verify that in Proposition 6 and Corollary 2 we can replace the term
“concave” with “strictly concave,” provided that we replace “≤” with “<” in (16) and (17).
Exercise 38.H Given any n ∈ N, let O be a nonempty open and convex subset of $\mathbb{R}^n$, and assume that $\varphi \in \mathbb{R}^O$ is a map such that $\partial_i\varphi(x)$ exists for each x ∈ O and i = 1, ..., n. Show that if ϕ is concave, then
$$\varphi(y) - \varphi(x) \le \sum_{i=1}^{n} \partial_i\varphi(x)(y_i - x_i) \ \text{ for all } x, y \in O. \tag{18}$$
Conversely, if (18) holds, and each $\partial_i\varphi(\cdot)$ is a continuous function on O, then ϕ is concave.
If X = R in Corollary 2, then $D_{\varphi,z}(t) = \varphi'(z)t$ for all t ∈ R and z ∈ S, so (17) reduces to $(\varphi'(y) - \varphi'(x))(y - x) \le 0$ for all x, y ∈ S. So, a very special case of Corollary 2 is the well-known fact that a differentiable real map on an open interval is concave iff its derivative is decreasing on that interval. But a differentiable real function on a given open interval is decreasing iff its derivative is not strictly positive anywhere on that interval. It follows that a twice differentiable real map on an open interval is concave iff the second derivative of that map is nonpositive everywhere. The following result generalizes this observation.
Corollary 3. Let S be a relatively open and convex subset of a normed linear space X, and $\varphi \in \mathbb{R}^S$ a continuously Fréchet differentiable map which is, also, twice Fréchet differentiable. Then, ϕ is concave if, and only if,
$$D^2_{\varphi,x}(z, z) \le 0 \ \text{ for all } (x, z) \in S \times X. \tag{19}$$
Proof. If (19) is true, then, by the Generalized Second Mean Value Theorem, we have $\varphi(x) - \varphi(y) \le D_{\varphi,y}(x - y)$ for any x, y ∈ S, so the claim follows from Proposition 6. Conversely, suppose ϕ is concave, and take any x ∈ S and z ∈ span(S − x). Observe that x + λz ∈ x + span(S − x) = aff(S), for any λ ∈ R. So, since S is relatively open and x ∈ S, we can choose a δ > 0 small enough so that (i) x + λz ∈ S for all −δ < λ < δ; and (ii) the map $f \in \mathbb{R}^{(-\delta,\delta)}$ defined by f(λ) := ϕ(x + λz) is concave so that $f''(0) \le 0$. (Right?) But we have $f''(\lambda) = D^2_{\varphi,x+\lambda z}(z, z)$ for any −δ < λ < δ. (Yes?) Hence, $f''(0) \le 0$ implies $D^2_{\varphi,x}(z, z) \le 0$.
Exercise 39 Show that, under the conditions of Corollary 3, if $D^2_{\varphi,x}(z, z) < 0$ for all (x, z) ∈ S × X, then ϕ is strictly concave, but not conversely.
You of course know that the derivative of a differentiable real map need not be continuous. (That is, a differentiable real function need not be continuously differentiable.) One of the remarkable properties of concave (and hence convex) real