Real Analysis with Economic Applications - Chapter K docx


Chapter K

Diﬀerential Calculus

In the second half of this book, starting from Chapter F, we have worked on developing a thorough understanding of function spaces, be it from a geometric or an analytic viewpoint. This work allows us to move in a variety of directions. In particular, we can now extend the classical differential calculus methods to the realm of maps defined on suitable function spaces or, more generally, on normed linear spaces. In turn, this “generalized” calculus can be used to develop a theory of optimization in which the choice objects need not be n-vectors, but members of an arbitrary normed linear space (as in calculus of variations, control theory and/or dynamic programming). This task is carried out in the present chapter.

We begin with a quick retake on the notion of derivative (of a real-to-real function), and point to the fact that there are advantages of viewing this notion as a particular linear functional, as opposed to a number. Once this point is understood, it becomes straightforward to extend the notion of “derivative” to the context of functions whose domains and codomains lie within arbitrary normed linear spaces. Moreover, the resulting derivative concept, called the Fréchet derivative, inherits many properties of the derivative that you are duly familiar with from classical calculus. We study this concept in fair detail here, go through several examples, and extend some well-known results of calculus to this realm, such as the Chain Rule, the Mean Value Theorem, etc. Keeping an eye on optimization theoretic applications, we also revisit the theory of concave functions, this time making use of Fréchet derivatives.1

The “use” of this work is demonstrated by means of a brief introduction to (infinite dimensional) optimization theory. Here we see how one can easily extend first- and second-order conditions for local extremum of real functions on the line to the broader context of real maps on normed linear spaces. We also show how useful concave functions are in this context. As for an application, and as a final order of business in this text, we sketch a precursory, but rigorous, introduction to calculus of variations, and consider a few of its economic applications.2

1 For reasons that largely escape me, most texts on functional analysis do not cover differential calculus on normed linear spaces. A thorough treatment of this topic can be found in Dieudonné (1969), but you may well find that exposition a little “heavy.” Some texts on optimization theory (such as Luenberger (1969)) do contain a discussion of Fréchet differentiation, but rarely develop the theory to the extent that we do here. The best reference I know about differential calculus on normed linear spaces is a little book by Cartan (1972). That book is unfortunately out of print at present (I was able to read it, I’m proud to say, from its Spanish edition), but if you can get a hold of it, you would be in for a treat.

2 Due to space constraints, I don’t go into control theory here even though this is among the standard methods of dynamic economic analysis in continuous time. To a careful eye, it will be evident that the machinery developed here can as well be used to go deep into constrained optimization over Banach spaces, from which point control theory is within a stone’s throw.

1 Fréchet Differentiation

So far in this text we have worked almost exclusively with limits of sequences. It will be convenient to depart from this practice in this chapter, and work instead with limits of functions. The basic idea is a straightforward generalization of one that you are surely familiar with from calculus (Section A.4.2).

Let T be a nonempty subset of a normed linear space X (whose norm is ‖·‖), and Φ a function that maps T into a normed linear space Y (whose norm is ‖·‖_Y). Let x ∈ X be the limit of at least one sequence in T\{x}.3

A point y ∈ Y is said to be the limit of Φ at x, in which case we say “Φ(ω) approaches y as ω → x,” provided that Φ(x^m) → y holds for every sequence (x^m) in T\{x} with x^m → x. This situation is denoted as

$$\lim_{\omega\to x}\Phi(\omega) = y.$$

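Clearly, we have lim_{ω→x} Φ(ω) = y iff, for each ε > 0, there exists a δ > 0 such that ‖y − Φ(ω)‖_Y < ε for all ω ∈ T\{x} with ‖ω − x‖ < δ. (Yes?) Thus,

$$\lim_{\omega\to x}\Phi(\omega) = y \quad\text{iff}\quad \lim_{\omega\to x}\|\Phi(\omega)-y\|_Y = 0.$$

Limits of functions obey the familiar rules. If Φ1 and Φ2 map T into Y and both have a limit at x, then

$$\lim_{\omega\to x}\big(\alpha\Phi_1(\omega)+\Phi_2(\omega)\big) = \alpha\lim_{\omega\to x}\Phi_1(\omega) + \lim_{\omega\to x}\Phi_2(\omega)$$

for any α ∈ R. If Y = R, and lim_{ω→x} Φ1(ω) and lim_{ω→x} Φ2(ω) are real numbers, then we also have

$$\lim_{\omega\to x}\Phi_1(\omega)\Phi_2(\omega) = \lim_{\omega\to x}\Phi_1(\omega)\,\lim_{\omega\to x}\Phi_2(\omega).$$

Finally, for x ∈ int_X(T), two maps Φ1 and Φ2 in Y^T that are continuous at x are said to be tangent at x if

$$\lim_{\omega\to x}\frac{\|\Phi_1(\omega)-\Phi_2(\omega)\|_Y}{\|\omega-x\|} = 0.$$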

So, if Φ1 and Φ2 are tangent at x, then not only is Φ1(x) = Φ2(x), but also, as ω → x, the distance between the values of these functions (i.e., ‖Φ1(ω) − Φ2(ω)‖_Y) converges to 0 “faster” than ω approaches x. Put another way, near x, the values of Φ2 approximate Φ1(x) better than ω approximates x from the same distance. In this sense, we can think of Φ2 as a “best approximation” of Φ1 near x.

3 Of course, if x ∈ int_X(T), this condition is automatically satisfied. (Yes?)

1.2 What is a Derivative?

In calculus one is taught that the derivative of a function f : R → R at a point x is a “number” that describes the rate of “instantaneous” change of the value of f as x changes. Yet, while useful for certain applications, this way of looking at things falls short of reflecting the intimate connection between the notion of the derivative of f at x and “the line that best approximates f near x.” We begin our discussion by recalling this interpretation.4

Let O be an open subset of R, x ∈ O, and f ∈ R^O. Suppose f is differentiable at x, that is, there is a real number f′(x) with

$$f'(x) = \lim_{t\to x}\frac{f(t)-f(x)}{t-x}. \tag{2}$$

Then the affine map t ↦ f(x) + f′(x)(t − x) is tangent to f at x, that is,

$$\lim_{t\to x}\frac{f(t)-f(x)-f'(x)(t-x)}{|t-x|} = 0.$$

Put differently, if f is differentiable at x, then there exists a linear functional L on R, namely, a ↦ f′(x)a, such that the affine map t ↦ f(x) + L(t − x) is tangent to f at x. The converse is also true. Indeed, if L : R → R is a linear functional such that t ↦ f(x) + L(t − x) is tangent to f at x, then there exists an α ∈ R such that L(a) = αa for all a ∈ R, and

$$\lim_{t\to x}\left|\frac{f(t)-f(x)}{t-x}-\alpha\right| = \lim_{t\to x}\frac{|f(t)-f(x)-\alpha(t-x)|}{|t-x|} = 0.$$

It follows that f is differentiable at x, and f′(x) = α.

This elementary argument establishes that f is differentiable at x iff there is a linear functional L on R such that

$$\lim_{t\to x}\frac{f(t)-f(x)-L(t-x)}{|t-x|} = 0, \tag{3}$$

that is, the affine map t ↦ f(x) + L(t − x) is tangent to f at x: it approximates f around x so well that, as t → x, the error f(t) − f(x) − L(t − x) of this approximation decreases to 0 faster than t tends to x.
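To see the tangency condition at work numerically, here is a minimal sketch (not part of the original text). It takes f(t) = t³ at x = 1, where f′(1) = 3, and compares the error ratio for the correct slope with that for a deliberately wrong slope. The helper name `tangency_ratio` and the particular test function are illustrative assumptions.

```python
def tangency_ratio(f, slope, x, t):
    """Return |f(t) - f(x) - slope*(t - x)| / |t - x| for t != x."""
    return abs(f(t) - f(x) - slope * (t - x)) / abs(t - x)


def f(t):
    # Illustrative choice of a differentiable function.
    return t ** 3


x = 1.0
true_slope = 3.0    # f'(1) for f(t) = t^3
wrong_slope = 2.5   # any other slope fails the tangency test

steps = [10.0 ** (-k) for k in range(1, 6)]
good = [tangency_ratio(f, true_slope, x, x + h) for h in steps]
bad = [tangency_ratio(f, wrong_slope, x, x + h) for h in steps]

for h, g, b in zip(steps, good, bad):
    print(f"h={h:.0e}  ratio(true slope)={g:.3e}  ratio(wrong slope)={b:.3e}")
```

With the true slope the ratio shrinks proportionally to h, whereas with the wrong slope it settles near |3 − 2.5| = 0.5. Only one linear functional passes the test, which is what makes it reasonable to call that functional "the" derivative.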

∗ ∗ ∗ ∗ FIGURE K.1 ABOUT HERE ∗ ∗ ∗ ∗

This is the key idea behind the very concept of differentiation: The local behavior of a differentiable function is linear just like that of an affine map. Thus, it makes sense to consider the linear functional L of (3) as central to the notion of derivative of f at x. In fact, it would be more honest to refer to L itself as the “derivative” of f at x. From this point of view, the number f′(x) is simply the slope of L; it is none other than a convenient way of identifying this linear functional.

4 “In the classical teaching of calculus, this idea is immediately obscured by the accidental fact that, on a one-dimensional vector space, there is a one-to-one correspondence between linear functionals and numbers, and therefore the derivative at a point is defined as a number instead of a linear functional. This slavish subservience to the shibboleth of numerical interpretation at any cost becomes much worse when dealing with functions of several variables.” Dieudonné (1968, p.147).

Okay, what’s the big deal? It may seem like we are fighting over semantics here. What difference does it make if we viewed the linear functional t ↦ f′(x)t as the derivative of f at x, instead of the number f′(x)? Well, think about it. How would you define the derivative of f at x if this function is defined on an open subset O of R²? The classical definition (2), which formalizes the notion of “rate of change,” immediately runs into difficulties in this situation. But the idea of “finding an affine map which is tangent to f at x” survives with no trouble whatsoever. All we have to do is to define the derivative of f at x as the linear functional L on R² with the property that

$$\lim_{t\to x}\frac{f(t)-f(x)-L(t-x)}{\|t-x\|_2} = 0.$$

Geometrically speaking, the graph of the affine map t ↦ f(x) + L(t − x) is none other than the hyperplane tangent to the graph of f at x (Figure 2).

∗ ∗ ∗ ∗ FIGURE K.2 ABOUT HERE ∗ ∗ ∗ ∗

Looking at the derivative of a function the “right” way saves the day in many other circumstances. Since the notion of tangency is well-defined for functions that map a normed linear space into another, this point of view remains meaningful for any such function, and paves the way toward the general theory we are about to lay out.5

Definition. Let X and Y be two normed linear spaces, and T a subset of X. For any x ∈ int_X(T), a map Φ : T → Y is said to be Fréchet differentiable at x if there is a continuous linear operator D_{Φ,x} ∈ B(X, Y) such that

$$\lim_{\omega\to x}\frac{\Phi(\omega)-\Phi(x)-D_{\Phi,x}(\omega-x)}{\|\omega-x\|} = 0. \tag{4}$$

The linear operator D_{Φ,x} is called the Fréchet derivative of Φ at x.6

5 Part of this theory can be developed within the context of metric linear spaces as well. However, I will work exclusively with normed linear spaces in this chapter, as the applied strength of differential calculus on metric linear spaces is nowhere near that on normed linear spaces.

6 Just to be on the safe side, let me remind you that 0 in (4) is the origin of Y. Put differently, (4) means

$$\lim_{\omega\to x}\frac{\|\Phi(\omega)-\Phi(x)-D_{\Phi,x}(\omega-x)\|_Y}{\|\omega-x\|} = 0,$$

that is, for every ε > 0, there exists a δ > 0 such that ‖Φ(ω) − Φ(x) − D_{Φ,x}(ω − x)‖_Y ≤ ε‖ω − x‖ for each ω ∈ N_{δ,X}(x).

If O is a nonempty open subset of T, and if Φ is Fréchet differentiable at every x ∈ O, then we say that Φ is Fréchet differentiable on O. If O = int_X(T) here, then Φ is said to be Fréchet differentiable. Finally, we say that Φ is continuously Fréchet differentiable if it is Fréchet differentiable and the map D_Φ : int_X(T) → B(X, Y), defined by D_Φ(x) := D_{Φ,x}, is continuous.

The idea should be clear at this point. Just as in plain ol’ calculus, we perturb x ∈ int_X(T) infinitesimally, and look at the behavior of the difference-quotient of Φ. Of course, here we can perturb x in all sorts of different ways. Indeed, since int_X(T) is open in X, any ω ∈ X can be thought of as a perturbation of x, provided that ‖ω − x‖ is small enough. Thus, the local linear behavior of Φ at x must be captured by a linear operator defined on the entire X. The Fréchet derivative of Φ at x is, then, a linear operator D_{Φ,x} defined on X, and one that ensures that the (globally linear) behavior of the affine map ω ↦ Φ(x) + D_{Φ,x}(ω − x) approximates the (locally linear) behavior of Φ at x accurately. (See Proposition 2 below.)

Before we jump into examples, there are a few matters to clarify. First, we need to justify why we call D_{Φ,x} “the” Fréchet derivative of Φ at x. How do we know that there is a unique D_{Φ,x} in B(X, Y) that satisfies (4)? To settle this matter, take any two K, L ∈ B(X, Y) and suppose that (4) holds both with D_{Φ,x} = K and D_{Φ,x} = L. We must then have

$$\lim_{\omega\to x}\frac{\|(K-L)(\omega-x)\|_Y}{\|\omega-x\|} = 0.$$

(Yes?) Since int_X(T) is open, this is equivalent to saying that

$$\lim_{\nu\to 0}\frac{\|(K-L)(\nu)\|_Y}{\|\nu\|} = 0.$$

(Why?7) It follows that

$$0 = \lim_{m\to\infty}\frac{\|(K-L)(\tfrac1m y)\|_Y}{\|\tfrac1m y\|} = \frac{\|(K-L)(y)\|_Y}{\|y\|} \quad\text{for all } y\in X\setminus\{0\},$$

so we have K = L. Conclusion: When it exists, the Fréchet derivative of a function at any given point in the interior of its domain is unique.

7 Be careful here. If x did not belong to the interior of T (in X), the former equation would not imply the latter.

The second issue we should discuss is why we define the Fréchet derivative of a function at a point as a continuous linear operator. Intuitively speaking, the main reason for this is that the notion of tangency makes geometric sense only when the functions involved are continuous at the point of tangency. So, at least at the point where we wish to define the derivative of the function, we had better ask the linear operator we seek to be continuous. But, of course, this is the same thing as asking for the continuity of that operator everywhere (why?), and hence we ask the Fréchet derivative of a function at a point to be continuous.8 By the way, as a major side benefit of this requirement, we are able to maintain the familiar rule

differentiability =⇒ continuity

Proposition 1. Let X and Y be two normed linear spaces, T a subset of X, and x ∈ int_X(T). If Φ ∈ Y^T is Fréchet differentiable at x, then Φ is continuous at x.

Proof. Take any Φ ∈ Y^T such that (4) holds for some D_{Φ,x} ∈ B(X, Y). Then

$$\lim_{\omega\to x}\big(\Phi(\omega)-\Phi(x)-D_{\Phi,x}(\omega-x)\big) = 0.$$

Since D_{Φ,x} is continuous, we have lim_{ω→x} D_{Φ,x}(ω − x) = D_{Φ,x}(0) = 0. It follows that lim_{ω→x} Φ(ω) = Φ(x), that is, Φ is continuous at x.9

Finally, let us mention two alternate formulations of the definition of the Fréchet derivative at a given point x in the interior of the domain of a map Φ ∈ Y^T. Immediate in this regard is the observation that, by changing variables, we can write (4) equivalently as

$$\lim_{\nu\to 0}\frac{\Phi(x+\nu)-\Phi(x)-D_{\Phi,x}(\nu)}{\|\nu\|} = 0.$$

(Yes?) When convenient, this formulation can be used instead of (4).

Our second reformulation stems from the fact that, just as in the case of one-variable calculus, the derivative notion that we consider here corresponds to a best local approximation of a given function. This is not entirely trivial, so we state it in precise terms.

Proposition 2. Let X and Y be two normed linear spaces, T a subset of X, x ∈ int_X(T), and L ∈ B(X, Y). For any Φ ∈ Y^T, L is the Fréchet derivative of Φ at x if, and only if, there exists a continuous map e ∈ Y^T such that

$$\Phi(\omega) = \Phi(x) + L(\omega-x) + e(\omega) \ \text{ for all } \omega\in \operatorname{int}_X(T), \quad\text{and}\quad \lim_{\omega\to x}\frac{e(\omega)}{\|\omega-x\|} = 0.$$

8 This issue never arises in classical calculus, as any linear operator from a Euclidean space into another is, perforce, continuous.

9 Just so that we’re on safe grounds here, let me note that I got lim_{ω→x} Φ(ω) = Φ(x) from

$$\lim_{\omega\to x}\big(\Phi(\omega)-\Phi(x)\big) = \lim_{\omega\to x}\big(\Phi(\omega)-\Phi(x)-D_{\Phi,x}(\omega-x)\big) + \lim_{\omega\to x}D_{\Phi,x}(\omega-x) = 0.$$

So, if Φ ∈ Y^T is Fréchet differentiable at x ∈ int_X(T), then the affine map ω ↦ Φ(x) + D_{Φ,x}(ω − x) is a best approximation of Φ at x in the sense that, as ω → x, the “size of the error” involved in this approximation, that is ‖e(ω)‖_Y, vanishes faster than ‖ω − x‖ goes to zero.

Exercise 1. Prove Proposition 2.

Exercise 2. For any two normed linear spaces X and Y, show that D_{L,x} = L for any L ∈ B(X, Y) and x ∈ X. (That is, the Fréchet derivative of a linear operator at any point in its domain equals the linear operator itself. Wouldn’t you expect this?)

Exercise 3. Let X and Y be two normed linear spaces, and ϕ : X × Y → R a continuous bilinear functional (Exercise J.52). Show that ϕ is Fréchet differentiable, and, for any (x∗, y∗) ∈ X × Y,

$$D_{\varphi,(x^*,y^*)}(z,w) = \varphi(z,y^*) + \varphi(x^*,w).$$

Exercise 4. Let X and Y be two normed linear spaces, O a nonempty open subset of X, and Φ ∈ Y^O a Fréchet differentiable map. Show that Φ would remain Fréchet differentiable if we replaced the norms of X and Y by equivalent norms, respectively. (Recall Section J.4.2.) Moreover, the Fréchet derivative of Φ would be the same in both cases.

Exercise 5. (The Gateaux Derivative) Let X and Y be two normed linear spaces and x ∈ X. The map Φ ∈ Y^X is said to be Gateaux differentiable at x if there exists an L ∈ B(X, Y) such that

$$\lim_{t\to 0}\left\|\frac{1}{t}\big(\Phi(x+ty)-\Phi(x)\big)-L(y)\right\|_Y = 0 \quad\text{for all } y\in X.$$

Here L is called the Gateaux derivative of Φ at x. (The idea is the generalization of that behind the notion of directional derivatives.)

(a) Show that, when it exists, the Gateaux derivative of Φ at x is unique.

(b) Prove: If Φ is Fréchet differentiable at x, then it is Gateaux differentiable at x, and its Gateaux derivative at x equals D_{Φ,x}.


Example 1. [1] Let I be an open interval and x ∈ I. The Fréchet derivative of a differentiable function f ∈ R^I at x is the linear map t ↦ f′(x)t on R, that is,

D_{f,x}(t) = f′(x)t for all t ∈ R.

Thus the Fréchet derivative of f at x is exactly what we argued in Section 1.2 the “derivative of f at x” should mean. The number f′(x) serves only to identify the linear map, a certain shift of which gives us a best approximation of f near x.

We observe here that any differentiable function f ∈ R^I is Fréchet differentiable, and D_f : I → B(R, R) satisfies

D_f(x)(t) = f′(x)t for all x ∈ I and t ∈ R.

[2] Take any m ∈ N and any open interval I. If Φ : I → R^m is Fréchet differentiable at x ∈ I, then D_{Φ,x} is a linear operator from R into R^m given by

D_{Φ,x}(t) = (Φ′_1(x)t, ..., Φ′_m(x)t) for all −∞ < t < ∞,

where Φ_i ∈ R^I is the ith component map of Φ, i = 1, ..., m. This is a special case of a result we shall prove shortly.

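Reminder. Given any n ∈ N, let S be a subset of R^n with nonempty interior. Where e^j denotes the jth unit vector in R^n, the jth partial derivative of ϕ ∈ R^S at x ∈ int_{R^n}(S) is defined as

$$\partial_j\varphi(x) := \lim_{\varepsilon\to 0}\frac{\varphi(x+\varepsilon e^j)-\varphi(x)}{\varepsilon},$$

provided that this limit exists.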

[3] Let n ∈ N and take any open subset O of R^n. As we show below, if x ∈ O and ϕ ∈ R^O is Fréchet differentiable at x, then all partial derivatives of ϕ at x exist, and we have

$$D_{\varphi,x}(t_1,\dots,t_n) = \sum_{j=1}^n \partial_j\varphi(x)\,t_j \quad\text{for all } (t_1,\dots,t_n)\in\mathbb{R}^n.$$

Thus, the Fréchet derivative of ϕ at x is none other than the linear functional that corresponds to the n-vector (∂_1ϕ(x), ..., ∂_nϕ(x)), which, as you know, is called the gradient of ϕ at x.

[4] We now generalize. Given any m, n ∈ N, let O be an open subset of R^n, and fix any x ∈ O. Take any Φ_i ∈ R^O, i = 1, ..., m, and define Φ : O → R^m by

Φ(t_1, ..., t_n) := (Φ_1(t_1, ..., t_n), ..., Φ_m(t_1, ..., t_n)).

(Here the Φ_i are the component maps of Φ.) If Φ is Fréchet differentiable at x, then the partial derivatives of each Φ_i ∈ R^O at x exist, and we have

$$D_{\Phi,x}(t_1,\dots,t_n) = \Big(\sum_{j=1}^n \partial_j\Phi_1(x)\,t_j,\ \dots,\ \sum_{j=1}^n \partial_j\Phi_m(x)\,t_j\Big) \quad\text{for all } (t_1,\dots,t_n)\in\mathbb{R}^n,$$

where ∂_jΦ_i(x) is the jth partial derivative of Φ_i at x, i = 1, ..., m, j = 1, ..., n.10 Or, put differently, the linear operator D_{Φ,x} satisfies

$$D_{\Phi,x}(y) = J^x_\Phi\, y \quad\text{for all } y\in\mathbb{R}^n, \tag{6}$$

where J^x_Φ is the Jacobian matrix of Φ at x, that is, J^x_Φ := [∂_jΦ_i(x)]_{m×n}.

Just like f′(x) in [1] turned out to be the number that identifies the Fréchet derivative of f at x, we see here that the Jacobian matrix of Φ at x identifies the Fréchet derivative of Φ at x. A certain shift of this operator, namely, (Φ(x) − D_{Φ,x}(x)) + D_{Φ,x}, is an affine map from R^n into R^m that best approximates Φ near x.
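As a quick numerical illustration of how the Jacobian matrix identifies the Fréchet derivative, the following sketch (not part of the original text) checks that the error of the affine approximation, divided by the size of the perturbation, shrinks as the perturbation does. The map Φ(t1, t2) = (t1·t2, sin t1 + t2²), the base point, and all helper names are illustrative assumptions.

```python
import math


def phi(t):
    """Illustrative map from R^2 into R^2."""
    t1, t2 = t
    return (t1 * t2, math.sin(t1) + t2 ** 2)


def jacobian(t):
    """Jacobian of phi, computed by hand: rows are the gradients of the components."""
    t1, t2 = t
    return ((t2, t1), (math.cos(t1), 2.0 * t2))


def apply_matrix(matrix, v):
    """Matrix-vector product for small tuples."""
    return tuple(sum(a * b for a, b in zip(row, v)) for row in matrix)


def norm2(v):
    return math.sqrt(sum(c * c for c in v))


def error_ratio(x, v):
    """||phi(x+v) - phi(x) - J(x) v||_2 / ||v||_2."""
    linear_part = apply_matrix(jacobian(x), v)
    moved = phi((x[0] + v[0], x[1] + v[1]))
    base = phi(x)
    err = tuple(m - b - l for m, b, l in zip(moved, base, linear_part))
    return norm2(err) / norm2(v)


x = (0.5, -1.0)
direction = (0.6, 0.8)  # a unit vector
scales = (1e-1, 1e-2, 1e-3, 1e-4)
ratios = [error_ratio(x, (s * direction[0], s * direction[1])) for s in scales]

for s, r in zip(scales, ratios):
    print(f"||v||={s:.0e}  error ratio={r:.3e}")
```

The ratio decays roughly linearly in the size of the perturbation, which is the finite-dimensional face of the limit condition defining the Fréchet derivative. A single direction is tested here only for brevity; the definition requires the decay uniformly over all directions.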

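To prove (6), observe that, since D_{Φ,x} ∈ L(R^n, R^m), there must exist a matrix A := [a_ij]_{m×n} with D_{Φ,x}(y) = Ay for all y ∈ R^n (Example F.6). Then, by definition,

$$\lim_{\omega\to x}\frac{\Phi_i(\omega)-\Phi_i(x)-\sum_{j=1}^n a_{ij}(\omega_j-x_j)}{\|\omega-x\|_2} = 0, \quad i=1,\dots,m.$$

Setting ω = x + εe^j here, it follows that

$$\lim_{\varepsilon\to 0}\frac{\Phi_i(x+\varepsilon e^j)-\Phi_i(x)-a_{ij}\varepsilon}{|\varepsilon|} = 0,$$

so ∂_jΦ_i(x) exists and equals a_ij for each i and j. Hence A = J^x_Φ, and D_{Φ,x}(y) = J^x_Φ y for all y ∈ R^n.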

Exercise 7. True or false: f : R → R is continuously differentiable iff it is continuously Fréchet differentiable.

Exercise 8. Define ϕ : R² → R by ϕ(0) := 0 and ϕ(x) := x_1x_2/‖x‖_2 for all x ≠ 0. Show that ϕ is continuous and both ∂_1ϕ(·) and ∂_2ϕ(·) are well-defined on R², whereas ϕ is not Fréchet differentiable at 0.

∗ Exercise 9.H Let n ∈ N, and O a nonempty open and convex subset of R^n. Take any ϕ ∈ C(O) such that x ↦ ∂_iϕ(x) is a continuous function on O, i = 1, ..., n. Show that ϕ is Fréchet differentiable.

Exercise 10. State and prove a generalization of the previous result, which applies to continuous maps from a nonempty open and convex subset of R^n into R^m, n, m ∈ N.

Exercise 11. Given any n ∈ N, let X_i be a normed linear space, i = 1, ..., n, and O a nonempty open subset of the product normed linear space X := X_1 × ··· × X_n. Fix any x ∈ O, and, for each i, let O_i := {z_i ∈ X_i : (z_i, x_{−i}) ∈ O}, which is an open subset of X_i. Prove: If ϕ : O → R is Fréchet differentiable at x, then ϕ(·, x_{−i}) ∈ R^{O_i} is Fréchet differentiable at x_i, i = 1, ..., n, and we have

$$D_{\varphi,x}(z_1,\dots,z_n) = \sum_{i=1}^n D_{\varphi(\cdot,x_{-i}),x_i}(z_i).$$

(Compare with Example 1.[3].)

10 Please note that I do not at all claim here that Φ is necessarily Fréchet differentiable at x when the partial derivatives of each Φ_i ∈ R^O at x exist. This is, in fact, not true, simply because continuity of a map on R^n in each of its components does not imply the overall continuity of the map. (See Exercises 8 and 9 below.)

The rest of the examples considered here all work within the context of infinite dimensional normed linear spaces. We begin with a particularly simple one, and move towards more involved examples.

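Example 2. Define ϕ : C[0, 1] → R by

$$\varphi(f) := \int_0^1 f(t)^2\,dt.$$

Fix any f ∈ C[0, 1]. Since h² = f² + 2f(h − f) + (h − f)², we have

$$\varphi(h) = \varphi(f) + 2\int_0^1 f(t)(h(t)-f(t))\,dt + \int_0^1 (h(t)-f(t))^2\,dt$$

for any h ∈ C[0, 1]. Notice that, as ‖h − f‖∞ → 0, the last term here would approach 0 faster than ‖h − f‖∞ vanishes. Indeed, if we define e ∈ R^{C[0,1]} by e(h) := ∫_0^1 (h(t) − f(t))² dt, then |e(h)| ≤ ‖h − f‖∞² for each h, and hence

$$\lim_{h\to f}\frac{e(h)}{\|h-f\|_\infty} = 0.$$

By Proposition 2, therefore, ϕ is Fréchet differentiable at f. Since f was arbitrary, ϕ is Fréchet differentiable, and

$$D_\varphi(f)(g) = 2\int_0^1 f(t)g(t)\,dt \quad\text{for all } f,g\in C[0,1].$$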
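The computation in Example 2 can be checked numerically. The sketch below (not part of the original text) replaces integrals over [0, 1] by midpoint Riemann sums and verifies that the error of the linear approximation g ↦ 2∫fg, divided by the sup-norm of the perturbation g, shrinks as g does. The grid size, the choice f = exp, the perturbation shape sin(3t), and all helper names are illustrative assumptions.

```python
import math

N = 2000
GRID = [(k + 0.5) / N for k in range(N)]  # midpoints of N equal cells in [0, 1]


def integral(values):
    """Midpoint-rule approximation of an integral over [0, 1]."""
    return sum(values) / N


def phi(f):
    """phi(f) = integral of f(t)^2 (discretized)."""
    return integral([f(t) ** 2 for t in GRID])


def d_phi(f, g):
    """Candidate Frechet derivative at f, applied to g: 2 * integral of f*g."""
    return 2.0 * integral([f(t) * g(t) for t in GRID])


def sup_norm(g):
    """Sup-norm of g, approximated on the grid."""
    return max(abs(g(t)) for t in GRID)


def scaled_bump(s):
    """Perturbation g(t) = s * sin(3t), whose sup-norm is about |s|."""
    return lambda t: s * math.sin(3.0 * t)


f = math.exp
scales = (1e-1, 1e-2, 1e-3)
ratios = []
for s in scales:
    g = scaled_bump(s)
    h = lambda t: f(t) + g(t)  # used immediately, so the closure over g is safe
    err = abs(phi(h) - phi(f) - d_phi(f, g))
    ratios.append(err / sup_norm(g))

for s, r in zip(scales, ratios):
    print(f"scale={s:.0e}  error ratio={r:.3e}")
```

The error equals the (discretized) integral of g², so the ratio is bounded by the sup-norm of g, exactly as in the estimate |e(h)| ≤ ‖h − f‖∞² used in the example.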

Exercise 12.H Show that the map ϕ ∈ R^{C[0,1]} defined by ϕ(f) := f(0)² is Fréchet differentiable, and compute D_ϕ.

Exercise 13.H Show that the map ϕ ∈ R^{C[0,1]} defined by ϕ(f) := (1/3)∫_0^1 f(t)³ dt is Fréchet differentiable, and compute D_ϕ.

One-variable differential calculus is often useful in computing the Fréchet derivative of a given function, even when the domain of this function is infinite dimensional. In particular, Taylor’s Theorem is extremely useful for this purpose. Given our present purposes, all we need is the following “baby” version of that result.

The Second Mean Value Theorem. Let a and b be two distinct real numbers, and I := co{a, b}. If f : I → R is continuously differentiable on I and twice differentiable on int_R(I), then

$$f(b)-f(a) = f'(a)(b-a) + \tfrac12 f''(c)(b-a)^2 \quad\text{for some } c\in I\setminus\{a,b\}.$$

Proof. The idea of the proof is reminiscent of the usual way in which one deduces the Mean Value Theorem from Rolle’s Theorem (Exercise A.56). Define g : I → R by

$$g(t) := f(b)-f(t)-f'(t)(b-t)-\tfrac12 M(b-t)^2,$$

where M ∈ R is chosen to guarantee that g(a) = 0. Clearly, g is differentiable on int_R(I), and a quick computation yields

$$g'(t) = (b-t)(M-f''(t)) \quad\text{for any } t\in I\setminus\{a,b\}.$$

Moreover, since g(a) = 0 = g(b) and g is continuous on I, Rolle’s Theorem guarantees that g′(c) = 0 for some c ∈ I\{a, b}. But then M = f″(c), and we find

$$0 = g(a) = f(b)-f(a)-f'(a)(b-a)-\tfrac12 f''(c)(b-a)^2,$$

which is the desired equation.
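For a concrete feel of the intermediate point c in the Second Mean Value Theorem, here is a small sketch (not part of the original text). It takes f = exp on [0, 1], computes the constant M from the proof, and locates c by bisection. The function name `second_mvt_point` is hypothetical, and the bisection relies on the extra assumption that f″ is continuous and monotone on the interval, which holds for exp but is not required by the theorem.

```python
import math


def second_mvt_point(f, df, d2f, a, b, tol=1e-12):
    """Find c between a and b with
    f(b) - f(a) = df(a)*(b - a) + 0.5*d2f(c)*(b - a)**2,
    assuming d2f is continuous and monotone there (so bisection applies)."""
    # This is the constant M of the proof; we need d2f(c) = M.
    target = 2.0 * (f(b) - f(a) - df(a) * (b - a)) / (b - a) ** 2
    lo, hi = min(a, b), max(a, b)
    increasing = d2f(hi) > d2f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        # Move toward the side on which d2f can still reach the target.
        if (d2f(mid) < target) == increasing:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


a, b = 0.0, 1.0
c = second_mvt_point(math.exp, math.exp, math.exp, a, b)
residual = math.exp(b) - math.exp(a) - math.exp(a) * (b - a) - 0.5 * math.exp(c) * (b - a) ** 2
print(f"c = {c:.6f}, residual = {residual:.2e}")
```

For this choice the point can also be found by hand: e − 1 = 1 + e^c/2 gives c = ln(2(e − 2)), which lies strictly between 0 and 1 as the theorem promises.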

[…] (ω∗_m) ∈ ℓ∞ such that

$$u(\omega_i)-u(x_i) = u'(x_i)(\omega_i-x_i) + \tfrac12 u''(\omega_i^*)(\omega_i-x_i)^2$$

and ω∗_i ∈ co{ω_i, x_i} for each i = 1, 2, .... Consequently,

$$\varphi((\omega_m))-\varphi((x_m)) = \sum_{i=1}^\infty \delta^i u'(x_i)(\omega_i-x_i) + \tfrac12\sum_{i=1}^\infty \delta^i u''(\omega_i^*)(\omega_i-x_i)^2 \tag{7}$$

for any (ω_m) ∈ ℓ∞.11 Again, the trick here is to notice that, as (ω_m) → (x_m), the last term of this equation would approach 0 faster than ‖(ω_m − x_m)‖∞ vanishes. Since we assume that u″ is bounded here, say by the real number M > 0, it is very easy to show this. Define e : ℓ∞ → R by e((ω_m)) := ½ Σ_{i=1}^∞ δ^i u″(ω∗_i)(ω_i − x_i)², and note that

$$|e((\omega_m))| \le \frac{M}{2}\Big(\sum_{i=1}^\infty \delta^i\Big)\|(\omega_m-x_m)\|_\infty^2$$

for any (ω_m) ∈ ℓ∞. It follows that

$$\lim_{(\omega_m)\to(x_m)}\frac{e((\omega_m))}{\|(\omega_m-x_m)\|_\infty} = 0,$$

as desired. Hence, by Proposition 2, we may conclude that

$$D_{\varphi,(x_m)}((y_m)) = \sum_{i=1}^\infty \delta^i u'(x_i)\,y_i \quad\text{for all } (y_m)\in\ell^\infty.$$

What does this mean, intuitively? Well, consider the map ψ : ℓ∞ → R defined by

$$\psi((y_m)) := \varphi((x_m)) + \sum_{i=1}^\infty \delta^i u'(x_i)(y_i-x_i).$$

This is an affine map, so its behavior is globally linear. And we have just found out that, near (x_m), the behavior of ψ and our original (nonlinear) map ϕ are “very similar,” in the sense that these two maps are tangent to each other at (x_m). So, the local linear behavior of ϕ around (x_m) is best captured by the linear behavior of the affine map ψ.

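[…] ‖h − f‖∞ → 0 as h → f, then we may conclude that […]. But

$$\frac{\|e(h)\|_\infty}{\|h-f\|_\infty} \le \tfrac12\,\|H''\circ\theta_h\|_\infty\,\|h-f\|_\infty,$$

so all we need here is to show that ‖H″ ∘ θ_h‖∞ is uniformly bounded for any h which is sufficiently close to f. This is quite easy. Obviously, we have ‖θ_h‖∞ ≤ ‖f‖∞ + 1 for any h ∈ N_{1,C[0,1]}(f). Moreover, since H″ is continuous, there is a number M > 0 such that |H″(a)| ≤ M for all a ∈ R with |a| ≤ ‖f‖∞ + 1. Thus:

$$\frac{\|e(h)\|_\infty}{\|h-f\|_\infty} \le \frac{M}{2}\,\|h-f\|_\infty \quad\text{for all } h\in N_{1,C[0,1]}(f)\setminus\{f\}.$$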

∗ Exercise 14. (The Nemyitskiĭ Operator) Let H : R² → R be a continuous function such that ∂_2H and ∂_2∂_2H are continuous functions on R². Define the self-map Φ on C[0, 1] by Φ(f)(t) := H(t, f(t)). Show that Φ is Fréchet differentiable, and

D_Φ(f)(g)(t) = ∂_2H(t, f(t))g(t) for all f, g ∈ C[0, 1] and 0 ≤ t ≤ 1.

An important topic in functional analysis concerns the determination of normed linear spaces the norms of which are Fréchet differentiable. We will have little to say on this topic in this book, but the following exercises might give you at least an idea about it.

Exercise 15. Let (x_m) ∈ ℓ²\{0}, and show that ‖·‖_2 : ℓ² → R_+ is Fréchet differentiable at (x_m) with

$$D_{\|\cdot\|_2,(x_m)}((y_m)) = \frac{1}{\|(x_m)\|_2}\sum_{i=1}^\infty x_i y_i \quad\text{for all } (y_m)\in\ell^2.$$

The following exercise generalizes this observation.

Exercise 16. Let X be a pre-Hilbert space (Exercise J.12) with the inner product φ. A famous theorem of linear analysis states that for any continuous linear functional L on X, there exists a y ∈ X such that L(x) = φ(x, y) for all x ∈ X. (This is the Riesz Representation Theorem.) Assuming the validity of this fact, show that the norm ‖·‖ of X is Fréchet differentiable at each x ∈ X\{0}, and we have D_{‖·‖,x}(y) = φ(x, y)/‖x‖ for all x ∈ X\{0} and y ∈ X.

Exercise 17. It is well known that for any L ∈ B(ℓ¹, R) there exists an (a_m) ∈ ℓ∞ such that L((x_m)) = Σ_{i=1}^∞ a_i x_i. (You don’t have to prove this result here.) Use this fact to show that the norm ‖·‖_1 on ℓ¹ is not Fréchet differentiable anywhere.

∗ Exercise 18.H Determine all points at which ‖·‖∞ : ℓ∞ → R_+ is Fréchet differentiable.

Most of the basic rules of differentiation of one-variable calculus have straightforward generalizations in terms of Fréchet derivatives. Just as you would suspect, for instance, the Fréchet derivative of a given linear combination of Fréchet differentiable operators equals that linear combination of the Fréchet derivatives of the involved operators.

Proposition 3. Let X and Y be two normed linear spaces, T a subset of X, and x ∈ int_X(T). Let Φ and Ψ be two maps in Y^T which are Fréchet differentiable at x. Then, for any real number α, αΦ + Ψ is Fréchet differentiable at x, and

D_{αΦ+Ψ,x} = αD_{Φ,x} + D_{Ψ,x}.

Proof. By Proposition 2, there exist maps e_Φ and e_Ψ in Y^T such that

Φ(ω) − Φ(x) = D_{Φ,x}(ω − x) + e_Φ(ω) and Ψ(ω) − Ψ(x) = D_{Ψ,x}(ω − x) + e_Ψ(ω)

for all ω ∈ int_X(T), and

$$\lim_{\omega\to x}\frac{e_\Phi(\omega)}{\|\omega-x\|} = 0 = \lim_{\omega\to x}\frac{e_\Psi(\omega)}{\|\omega-x\|}.$$

Letting e := αe_Φ + e_Ψ, we then have (αΦ + Ψ)(ω) − (αΦ + Ψ)(x) = (αD_{Φ,x} + D_{Ψ,x})(ω − x) + e(ω) for all ω ∈ int_X(T), and lim_{ω→x} e(ω)/‖ω − x‖ = 0. Applying Proposition 2 again completes the proof.

Exercise 19. Let X be a normed linear space, and O a nonempty open subset of X. Prove: If ϕ, ψ ∈ R^O are Fréchet differentiable at x ∈ O, then the product operator ϕψ is Fréchet differentiable at x, and

D_{ϕψ,x} = ψ(x)D_{ϕ,x} + ϕ(x)D_{ψ,x}.

The next result should again be familiar from ordinary calculus.

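Proposition 4. (The Chain Rule) Let X, Y and Z be normed linear spaces, and S and T subsets of X and Y, respectively. Let Φ ∈ T^S and Ψ ∈ Z^T be two maps such that Φ is Fréchet differentiable at x ∈ int_X(S), and Ψ at Φ(x) ∈ int_Y(T). Then, Ψ ∘ Φ is Fréchet differentiable at x, and

D_{Ψ∘Φ,x} = D_{Ψ,Φ(x)} ∘ D_{Φ,x}.

Proof. By Proposition 2, there exist maps e_Φ ∈ Y^S and e_Ψ ∈ Z^T such that

$$\Phi(\omega)-\Phi(x) = D_{\Phi,x}(\omega-x) + e_\Phi(\omega) \quad\text{for all } \omega\in\operatorname{int}_X(S) \tag{9}$$

and

$$\Psi(w)-\Psi(\Phi(x)) = D_{\Psi,\Phi(x)}(w-\Phi(x)) + e_\Psi(w) \quad\text{for all } w\in\operatorname{int}_Y(T),$$

while lim_{ω→x} e_Φ(ω)/‖ω − x‖ = 0 and lim_{w→Φ(x)} e_Ψ(w)/‖w − Φ(x)‖_Y = 0. Substituting w = Φ(ω) in the second equation and using (9), we find, for every ω ∈ int_X(S) with Φ(ω) ∈ int_Y(T),

$$\Psi(\Phi(\omega))-\Psi(\Phi(x)) = D_{\Psi,\Phi(x)}\big(D_{\Phi,x}(\omega-x)\big) + e(\omega),$$

where

$$e(\omega) := D_{\Psi,\Phi(x)}(e_\Phi(\omega)) + e_\Psi(\Phi(\omega)).$$

By Proposition 2, therefore, it remains to show that lim_{ω→x} e(ω)/‖ω − x‖ = 0. To this end, observe first that

$$\lim_{\omega\to x}\frac{\|D_{\Psi,\Phi(x)}(e_\Phi(\omega))\|_Z}{\|\omega-x\|} \le \|D_{\Psi,\Phi(x)}\|^*\lim_{\omega\to x}\frac{\|e_\Phi(\omega)\|_Y}{\|\omega-x\|} = 0,$$

where ‖·‖^* stands for the operator norm (Section J.4.3). The proof will thus be complete if we can establish that lim_{ω→x} e_Ψ(Φ(ω))/‖ω − x‖ = 0. This requires harder work. Note first that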

limω→x

Φ(ω)−Φ(x) Y

ω−x = 0,therefore, there exists an ε > 0 such that Φ(ω)−Φ(x) Y

D_{Φ_0,f}(h) = 0 and D_{Φ_i,f}(h) = i f^{i−1}h, i = 1, ..., m.

Now define Φ : C[0, 1] → R by

$$\Phi(f) := \sum_{i=0}^m a_i\int_0^1 f(t)^i\,dt,$$

where a_0, ..., a_m ∈ R. Using Propositions 3 and 4 we can readily compute the Fréchet derivative of Φ. Indeed, if L ∈ B(C[0, 1], R) is defined by L(f) := ∫_0^1 f(t) dt, then we have

$$\Phi = \sum_{i=0}^m a_i\,(L\circ\Phi_i).$$

So, since D_{L,h} = L for any h (why?), Propositions 3 and 4 yield

$$D_{\Phi,f}(g) = \sum_{i=0}^m a_i D_{L\circ\Phi_i,f}(g) = \sum_{i=0}^m a_i L\big(D_{\Phi_i,f}(g)\big) = \sum_{i=1}^m a_i\int_0^1 i\,f(t)^{i-1}g(t)\,dt.$$
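The formula just obtained can be sanity-checked numerically. The sketch below (not part of the original text) discretizes integrals over [0, 1] with midpoint sums and confirms that Φ(f + g) − Φ(f) − D_{Φ,f}(g) becomes small relative to the sup-norm of g. The coefficients, the choice f(t) = 1 + t, the perturbation shape cos(2t), and all helper names are illustrative assumptions.

```python
import math

N = 2000
GRID = [(k + 0.5) / N for k in range(N)]  # midpoints of N equal cells in [0, 1]
COEFFS = (1.0, -2.0, 0.5, 3.0)            # illustrative a_0, ..., a_3


def integral(values):
    """Midpoint-rule approximation of an integral over [0, 1]."""
    return sum(values) / N


def big_phi(f):
    """Phi(f) = sum_i a_i * integral of f(t)^i (discretized)."""
    return sum(a * integral([f(t) ** i for t in GRID])
               for i, a in enumerate(COEFFS))


def d_big_phi(f, g):
    """Candidate derivative: sum_{i>=1} a_i * integral of i f(t)^(i-1) g(t)."""
    return sum(a * integral([i * f(t) ** (i - 1) * g(t) for t in GRID])
               for i, a in enumerate(COEFFS) if i >= 1)


def sup_norm(g):
    return max(abs(g(t)) for t in GRID)


def f(t):
    return 1.0 + t


def scaled_wave(s):
    """Perturbation g(t) = s * cos(2t), whose sup-norm is about |s|."""
    return lambda t: s * math.cos(2.0 * t)


scales = (1e-1, 1e-2, 1e-3)
ratios = []
for s in scales:
    g = scaled_wave(s)
    shifted = lambda t: f(t) + g(t)  # used immediately within this iteration
    err = abs(big_phi(shifted) - big_phi(f) - d_big_phi(f, g))
    ratios.append(err / sup_norm(g))

for s, r in zip(scales, ratios):
    print(f"scale={s:.0e}  error ratio={r:.3e}")
```

The ratio falls roughly in proportion to the scale of the perturbation, which is consistent with the derivative delivered by Propositions 3 and 4. This is of course only a check along one family of perturbations, not a proof.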

∗ Exercise 20.H (The Hammerstein Operator) Define the self-map Φ on C[0, 1] by

$$\Phi(f)(x) := \int_0^1 \theta(x,t)H(t,f(t))\,dt,$$

where θ is a continuous real map on [0, 1]² and H : R² → R is a continuous function such that ∂_2H and ∂_2∂_2H are well-defined and continuous functions on R². Use the Chain Rule to show that Φ is Fréchet differentiable, and compute D_Φ.

Exercise 21. Take any natural numbers n, m and k. Let O and U be nonempty open subsets of R^n and R^m, respectively. For any given x ∈ O, let Φ : O → U and Ψ : U → R^k be two maps that are Fréchet differentiable at x and Φ(x), respectively. Show that Ψ ∘ Φ is Fréchet differentiable at x, and

$$D_{\Psi\circ\Phi,x}(y) = J^{\Phi(x)}_\Psi J^x_\Phi\, y \quad\text{for all } y\in\mathbb{R}^n,$$

where J^x_Φ and J^{Φ(x)}_Ψ are the Jacobian matrices of Φ (at x) and Ψ (at Φ(x)), respectively.

Exercise 22. Let X be a normed linear space, O a nonempty open and convex subset of X, and Φ ∈ R^O a Fréchet differentiable map. Fix any distinct x, y ∈ O, and define F : (0, 1) → R by F(λ) := Φ(λx + (1 − λ)y). Show that F is differentiable, and

F′(λ) = D_{Φ,λx+(1−λ)y}(x − y) for all 0 < λ < 1.

Exercise 23. Let X and Y be two normed linear spaces, and ϕ a Fréchet differentiable real map on X × Y. Fix any x ∈ X, and define ψ ∈ R^Y by ψ(y) := ϕ(x − y, y). Use the Chain Rule to prove that ψ is differentiable and compute D_ψ.

To outline a basic introduction to optimization theory, we also need to go through the notion of the second Fréchet derivative of real functions. Just as in ordinary calculus, the idea is to define this notion as the “derivative of the derivative.” Unfortunately, life gets a bit complicated here. Recall that the Fréchet derivative D_ϕ of a real function ϕ defined on an open subset O of a normed linear space X is a function that maps O into X∗. Therefore, the Fréchet derivative of D_ϕ at x ∈ O is a continuous linear function that maps X into X∗. Put differently, the second Fréchet derivative of ϕ at x is a member of B(X, X∗).

Just in case you find this confusing, let us see how this situation compares with differentiating a function of the form ϕ : R² → R twice. In calculus, by the derivative of ϕ at x ∈ R², we understand the gradient of ϕ, that is, the vector (∂_1ϕ(x), ∂_2ϕ(x)) ∈ R². In turn, the second derivative of ϕ at x is the matrix H_x := [∂_{ij}ϕ(x)]_{2×2}, where by ∂_{ij}ϕ(x), we understand ∂_i∂_jϕ(x). (You might recall that this matrix is called the Hessian of ϕ at x.) Given Example 1.[3], it is only natural that the second Fréchet derivative of ϕ at x is the linear operator induced by the matrix H_x (i.e., y ↦ H_x y), and hence it is a linear function that maps R² into R². Since the dual of R² is R² (Example J.7), this situation conforms perfectly with the outline of the previous paragraph.

At any rate, things will get clearer below. Let us first state the definition of the second Fréchet derivative of a real function formally.

Definition. Let T be a nonempty subset of a normed linear space X, and ϕ : T → R a Fréchet differentiable map. For any given x ∈ int_X(T), if D_ϕ : int_X(T) → X∗ is Fréchet differentiable at x, then we say that ϕ is twice Fréchet differentiable at x. In this case, the second Fréchet derivative of ϕ at x, denoted by D²_{ϕ,x}, is a member of B(X, X∗); we define

D²_{ϕ,x} := D_{D_ϕ,x}.

If O is a nonempty open subset of T, and if ϕ is twice Fréchet differentiable at every x ∈ O, then we say that ϕ is twice Fréchet differentiable on O. If O = int_X(T) here, then ϕ is said to be twice Fréchet differentiable.

The thing to get used to here is that D²_{ϕ,x} ∈ B(X, X∗), that is, D²_{ϕ,x}(y) ∈ X∗ for each y ∈ X. We should thus write D²_{ϕ,x}(y)(z) for the value of the linear functional D²_{ϕ,x}(y) at z. It is, however, customary to write D²_{ϕ,x}(y, z) instead of D²_{ϕ,x}(y)(z), thereby thinking of D²_{ϕ,x} as a function that maps X × X into R. From this viewpoint, D²_{ϕ,x} is a continuous bilinear functional on X × X (Exercise J.52).12

The following is an analogue (and an easy consequence) of Proposition 2 for the second Fréchet derivative of a real function.

Proposition 5. Let X be a normed linear space, T a subset of X, x ∈ int_X(T), and L ∈ B(X, X∗). For any ϕ ∈ R^T, L is the second Fréchet derivative of ϕ at x if, and only if, there exists a continuous map e : T → X∗ such that

$$D_{\varphi,\omega} = D_{\varphi,x} + L(\omega-x) + e(\omega) \ \text{ for all } \omega\in\operatorname{int}_X(T), \quad\text{and}\quad \lim_{\omega\to x}\frac{e(\omega)}{\|\omega-x\|} = 0.$$

12 This custom is fully justified, of course. After all, B(X, X∗) “is” the normed linear space of all continuous bilinear functionals on X × X, that is, these two spaces are linearly isometric. (Recall Exercise J.63.)

Exercise 24. Prove Proposition 5.

Example 6. [1] Let I be an open interval, f ∈ R^I a differentiable map, and x ∈ I. If f is twice differentiable at x, then there is an error function e_1 : I → R such that f′(ω) = f′(x) + f″(x)(ω − x) + e_1(ω) for all ω ∈ I, and lim_{ω→x} e_1(ω)/|ω − x| = 0. (Why?) Hence, by Example 1.[1],

D_{f,ω}(t) − D_{f,x}(t) = (f′(ω) − f′(x))t = f″(x)(ω − x)t + e_1(ω)t

for all ω ∈ I and t ∈ R. We define L : R → R∗ and e : I → R∗ by L(u)(v) := f″(x)uv and e(u)(v) := e_1(u)v, respectively. Then

D_{f,ω}(t) − D_{f,x}(t) = L(ω − x)(t) + e(ω)(t) for all ω ∈ I and t ∈ R,

and it follows from Proposition 5 that D²_{f,x} = L, that is,

D²_{f,x}(u, v) = f″(x)uv for all u, v ∈ R.

Reminder. Given any n ∈ N, let S be a subset of Rⁿ with nonempty interior, and ϕ ∈ R^S a map such that the jth partial derivative of ϕ (as a real map on int_Rⁿ(S)) exists. For any x ∈ int_Rⁿ(S) and i, j = 1, ..., n, the number ∂ijϕ(x) := ∂i∂jϕ(x) is referred to as a second-order partial derivative of ϕ at x. If ∂ijϕ(x) exists for each x ∈ int_Rⁿ(S), then we refer to the map x → ∂ijϕ(x) on int_Rⁿ(S) as a second-order partial derivative of ϕ. (Note. A “folk” theorem of advanced calculus says that if ∂ijϕ and ∂jiϕ are continuous maps (on int_Rⁿ(S)), then ∂ijϕ = ∂jiϕ.)

[2] Given any n ∈ N, let O be an open subset of Rⁿ, and take any Fréchet differentiable map ϕ ∈ R^O. If ϕ is twice Fréchet differentiable at x ∈ O, then all second-order partial derivatives of ϕ at x exist, and we have

$$D^2_{\varphi,x}(u, v) = \sum_{i=1}^{n} \sum_{j=1}^{n} \partial_{ij}\varphi(x)\, u_i v_j \quad\text{for all } u, v \in \mathbb{R}^n.$$

Thus, the second Fréchet derivative of ϕ at x is none other than the symmetric bilinear functional induced by the so-called Hessian matrix [∂ijϕ(x)]n×n of ϕ at x.
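As a sanity check on the formula in Example 6.[2], the following Python sketch compares the bilinear form induced by the Hessian matrix with a second-order difference quotient. The map phi, the point x, the directions u and v, and the step size h are all illustrative assumptions, not taken from the text; this is a numerical illustration only, not a proof.

```python
def phi(x):
    """Sample smooth map on R^2 (an illustrative choice, not from the text)."""
    x1, x2 = x
    return x1 ** 2 * x2 + x2 ** 3


def hessian(x):
    """Hand-computed Hessian matrix [d_ij phi(x)] of the sample map."""
    x1, x2 = x
    return [[2 * x2, 2 * x1],
            [2 * x1, 6 * x2]]


def bilinear_from_hessian(x, u, v):
    """Sum over i, j of d_ij phi(x) * u_i * v_j."""
    H = hessian(x)
    return sum(H[i][j] * u[i] * v[j] for i in range(2) for j in range(2))


def second_difference(x, u, v, h=1e-4):
    """Mixed second difference quotient approximating D^2 phi at x on (u, v)."""
    def shift(a, b):
        return [x[k] + a * u[k] + b * v[k] for k in range(2)]
    return (phi(shift(h, h)) - phi(shift(h, 0.0))
            - phi(shift(0.0, h)) + phi(shift(0.0, 0.0))) / h ** 2


x, u, v = [1.0, 2.0], [0.5, -1.0], [2.0, 3.0]
exact = bilinear_from_hessian(x, u, v)
approx = second_difference(x, u, v)
print(exact, approx)
```

The two printed numbers should agree up to an error of order h, which is what one expects if the Hessian really induces the second Fréchet derivative.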

Exercise 25 Prove the assertion made in Example 6.[2].

Example 7. Let O be a nonempty open and convex subset of a normed linear space X, and let x and y be two distinct points in O. Take any twice Fréchet differentiable map ϕ ∈ R^O, and define F ∈ R^(0,1) by

$$F(\lambda) := \varphi(\lambda x + (1 - \lambda)y).$$

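We wish to show that F is twice differentiable and compute F″. (Any guesses?)

By Exercise 22, F is differentiable, and we have

$$F'(\alpha) = D_{\varphi,\alpha x + (1-\alpha)y}(x - y), \quad 0 < \alpha < 1. \tag{12}$$

Define G := F′, fix any 0 < λ < 1, and let us agree to write ωα for αx + (1 − α)y for any 0 < α < 1. By Proposition 5, there exists a continuous map e : O → X∗ such that

$$D_{\varphi,\omega} = D_{\varphi,\omega_\lambda} + D^2_{\varphi,\omega_\lambda}(\omega - \omega_\lambda) + e(\omega) \ \text{ for all } \omega \in O, \quad\text{and}\quad \lim_{\omega \to \omega_\lambda} \frac{e(\omega)}{\|\omega - \omega_\lambda\|} = 0.$$

Thus, for any α ∈ (0, 1)\{λ}, (12) gives

$$F'(\alpha) - F'(\lambda) = D_{\varphi,\omega_\alpha}(x - y) - D_{\varphi,\omega_\lambda}(x - y) = D^2_{\varphi,\omega_\lambda}(\omega_\alpha - \omega_\lambda, x - y) + e(\omega_\alpha)(x - y).$$

Since ωα − ωλ = (α − λ)(x − y) and D²ϕ,ωλ is a bilinear functional (on X × X), we may divide both sides of this equation by α − λ to get

$$\frac{F'(\alpha) - F'(\lambda)}{\alpha - \lambda} = D^2_{\varphi,\omega_\lambda}(x - y, x - y) + \frac{e(\omega_\alpha)(x - y)}{\alpha - \lambda}$$

for any α ∈ (0, 1)\{λ}. But since α → ωα is a continuous map from (0, 1) into X,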

f(t)^i dt, where a0, ..., am ∈ R. Compute D²ϕ,f for any f ∈ C[0, 1].

We have now at hand a potent theory of differentiation which generalizes the classical theory. There still remains one major difficulty, however. Insofar as our basic definition is concerned, we are unable to differentiate a map that is defined on a subset of a normed linear space with no interior. For instance, let S = {(a, 1) : 0 < a < 1}, and define ϕ : S → R by ϕ(a, 1) := a². What is the Fréchet derivative of ϕ? Well, since S has no interior in R², our basic definition does not even allow us to pose this question. That definition is based on the idea of perturbing (infinitesimally) a given point in the interior of the domain of a map in any direction in the space, and analyzing the behavior of the resulting difference-quotient. In this example, because int_R²(S) = ∅, we cannot proceed in this manner. Indeed, the variations we consider must be horizontal in this case. Put differently, if x ∈ S and we wish to study the difference-quotient (ϕ(ω) − ϕ(x))/‖ω − x‖, the variation ω − x must belong to the linear subspace R × {0}. So, in this example, the Fréchet derivative of ϕ needs to be viewed as a linear functional from R × {0} into R (and not from R² into R).

Apparently, there is an obvious way we can generalize the definition of the Fréchet derivative, and capture these sorts of examples. All we need is to take the domain of the function to be differentiated as open in the affine manifold it generates. Let us first give such sets a name.

Definition. Let X be a normed linear space. A subset S of X is said to be relatively open if |S| > 1 and S is open in aff(S).13

Now consider a real map ϕ whose domain S is relatively open in some normed linear space. Obviously, the difference-quotient (ϕ(ω) − ϕ(x))/‖ω − x‖ makes sense iff ω ∈ S\{x}. Therefore, span(S − x) is the linear space that contains all possible variations about x, or equivalently, aff(S) is the set of all possible directions of perturbing x. (Recall the example considered above.) We are, therefore, led to define the Fréchet derivative of ϕ at x as a bounded linear functional on span(S − x).

Notation. Let T be a subset of a normed linear space. In what follows we denote the interior of T in aff(T) as T♦, that is,

$$T^{\diamond} := \operatorname{int}_{\operatorname{aff}(T)}(T).$$

Thus, T is relatively open iff T = T♦. Moreover, span(T♦ − z) = span(T − z) for any z ∈ T♦,14 and if T is convex and T♦ ≠ ∅, then T♦ = ri(T) (Proposition I.11).

13 If T = {x}, then T is itself an affine manifold, and hence it equals its interior in itself. But there is no sequence in T\{x} that converges to x, so we cannot possibly define the Fréchet derivative of a map on T. This is the reason why I assume |T| > 1 here.

Definition. Let X and Y be two normed linear spaces, and T a subset of X with |T| > 1 and T♦ ≠ ∅. Let

$$s(T) := \operatorname{span}(T - z), \tag{13}$$

where z ∈ T is arbitrary.16 For any x ∈ T♦, a map Φ : T → Y is said to be Fréchet differentiable at x if there is a continuous linear operator DΦ,x ∈ B(s(T), Y) such that

$$\lim_{\omega \to x} \frac{\Phi(\omega) - \Phi(x) - D_{\Phi,x}(\omega - x)}{\|\omega - x\|} = 0.$$

The linear operator DΦ,x is called the Fréchet derivative of Φ at x.

If Φ is Fréchet differentiable at every x ∈ T♦, then we say that Φ is Fréchet differentiable. In this case, we say that Φ is continuously Fréchet differentiable if the map DΦ : T♦ → B(s(T), Y), defined by DΦ(x) := DΦ,x, is continuous.

Definition. Let T be a subset of a normed linear space X with |T| > 1 and T♦ ≠ ∅, and take a Fréchet differentiable map ϕ : T → R. For any given x ∈ T♦, if Dϕ : T♦ → s(T)∗ is Fréchet differentiable at x (where s(T) is defined by (13)), we say that ϕ is twice Fréchet differentiable at x. In this case, the second Fréchet derivative of ϕ at x, denoted by D²ϕ,x, is a member of B(s(T), s(T)∗); we define

$$D^2_{\varphi,x} := D_{D_\varphi,x}.$$

If ϕ is twice Fréchet differentiable at every x ∈ T♦, then we say that ϕ is twice Fréchet differentiable.

These definitions extend the ones given in Sections 1.3 and 1.6. After all, if T is a subset of X with int_X(T) ≠ ∅, then aff(T) = X = s(T) (yes?), so the two definitions become identical. Moreover, most of our findings in the previous sections apply to the Fréchet derivatives of maps defined on any nonsingleton T ⊆ X with T♦ ≠ ∅. All we have to do is to apply those results on s(T) as opposed to X.17 For instance, Proposition 2 becomes the following in this setup.

Proposition 2∗. Let X and Y be two normed linear spaces, T a subset of X, x ∈ T♦, and L ∈ B(s(T), Y), where s(T) is defined by (13). For any Φ ∈ Y^T, L is the Fréchet derivative of Φ at x if, and only if, there exists a continuous map e ∈ Y^T such that

$$\Phi(\omega) = \Phi(x) + L(\omega - x) + e(\omega) \ \text{ for all } \omega \in T^{\diamond}, \quad\text{and}\quad \lim_{\omega \to x} \frac{e(\omega)}{\|\omega - x\|} = 0.$$

Let us now go back to our silly little example in which the question was to find the Fréchet derivative of ϕ : S → R where S = {(a, 1) : 0 < a < 1} and ϕ(a, 1) := a². As S is relatively open, this question now makes sense. Besides, it is very easy to answer. For any 0 < a < 1, Dϕ,(a,1) is a linear functional on R × {0}, and just as one would like to see, we have Dϕ,(a,1)(t, 0) = 2at for any t ∈ R. Moreover, D²ϕ,(a,1)

16 For any given S ⊆ X, we have span(S − y) = span(S − z) for any y, z ∈ S. Right?

17 There is one exception, however. In the statement of the Chain Rule (Proposition 4), if we posit that O and U are relatively open, then we need the additional hypothesis that DΦ,x(s(O)) ⊆ DΨ,Φ(x)(s(U)). With this modification, the proof goes through verbatim.

is a linear operator from R × {0} into the dual of R × {0} (which is R × {0} itself), or equivalently, a bilinear functional on (R × {0}) × (R × {0}). We have:

$$D^2_{\varphi,(a,1)}((u, 0), (v, 0)) = 2uv \quad\text{for all } u, v \in \mathbb{R},$$

as you can easily check.18
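The claim Dϕ,(a,1)(t, 0) = 2at can be probed numerically. The Python sketch below evaluates the error quotient from the definition of the Fréchet derivative, using only horizontal variations (t, 0), since these are the only ones that keep us inside S. The function names, the choice a = 0.3 and the step sizes are illustrative assumptions.

```python
def phi(point):
    """phi(a, 1) := a^2 on S = {(a, 1) : 0 < a < 1}."""
    a, second = point
    assert second == 1.0 and 0.0 < a < 1.0, "point must lie in S"
    return a ** 2


def D_phi(a, t):
    """Claimed derivative at (a, 1) applied to the horizontal variation (t, 0)."""
    return 2 * a * t


def quotient_error(a, t):
    """|phi(x + (t, 0)) - phi(x) - D(t, 0)| / ||(t, 0)||; should vanish as t -> 0."""
    return abs(phi((a + t, 1.0)) - phi((a, 1.0)) - D_phi(a, t)) / abs(t)


a = 0.3
errors = [quotient_error(a, 10.0 ** -k) for k in range(1, 6)]
print(errors)
```

For this particular ϕ the error quotient equals |t| up to rounding, so the printed values shrink roughly tenfold at each step.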

Remark 1. Not only do the definitions we have given in this section reduce to those of the earlier sections for maps defined on sets with nonempty interior, they are also consistent with them in the following sense. Let X and Y be two normed linear spaces, and S ⊆ X. Suppose that Φ : S → Y is Fréchet differentiable at x ∈ int_X(S). Then, for any T ⊆ S with |T| > 1 and x ∈ T♦, the map Φ|T is Fréchet differentiable at x, and we have

$$D_{\Phi|_T,x} = D_{\Phi,x}|_{s(T)}.$$

One way of stating the classical Mean Value Theorem is the following: Given a differentiable real map f on an open interval I, for any a, b ∈ I with a < b, there exists a c ∈ (a, b) such that f(b) − f(a) = f′(c)(b − a). This fact extends readily to real maps defined on a normed linear space.

The Generalized Mean Value Theorem. Let S be a relatively open and convex subset of a normed linear space X, and ϕ ∈ R^S a Fréchet differentiable map. Then, for any distinct x, y ∈ S,

$$\varphi(x) - \varphi(y) = D_{\varphi,z}(x - y) \quad\text{for some } z \in \operatorname{co}\{x, y\} \setminus \{x, y\}.$$
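To see the theorem at work in a concrete case, the following Python sketch locates a point z on the open line segment between x and y at which ϕ(x) − ϕ(y) = Dϕ,z(x − y). The map phi, the points x and y, and the use of bisection are illustrative assumptions; bisection works here only because the gap function happens to change sign on [0, 1] for this particular choice.

```python
import math


def phi(p):
    """Sample differentiable map on R^2 (an illustrative choice)."""
    return math.exp(p[0]) + p[1] ** 2


def D_phi(p, t):
    """Frechet derivative of phi at p applied to t (gradient dot t)."""
    return math.exp(p[0]) * t[0] + 2 * p[1] * t[1]


def segment_point(lam, x, y):
    """The point lam*x + (1 - lam)*y of co{x, y}."""
    return (lam * x[0] + (1 - lam) * y[0], lam * x[1] + (1 - lam) * y[1])


def gap(lam, x, y):
    """D_phi at the segment point applied to x - y, minus (phi(x) - phi(y))."""
    d = (x[0] - y[0], x[1] - y[1])
    return D_phi(segment_point(lam, x, y), d) - (phi(x) - phi(y))


def find_lambda(x, y, iterations=80):
    """Bisection on [0, 1]; assumes gap < 0 at lam = 0 and gap > 0 at lam = 1."""
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if gap(mid, x, y) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


x, y = (1.0, 1.0), (0.0, 0.0)
lam = find_lambda(x, y)
z = segment_point(lam, x, y)
print(lam, z, gap(lam, x, y))
```

The computed weight lies strictly between 0 and 1, so z is neither x nor y, in line with the statement of the theorem.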

Notice that the entire action takes place on the line segment co{x, y} here. Intuitively speaking, then, we should be able to prove this result simply by applying the good ol’ Mean Value Theorem on this line segment. The following elementary observation, which generalizes the result found in Exercise 22, is a means to this end.

18 Quiz. Compute Dϕ,(1/2,1) and D²ϕ,(1/2,1), assuming this time that S = {(a, 3/2 − a) : 0 < a < 3/2}. (Hint. The domain of Dϕ,(1/2,1) is {(t, −t) : t ∈ R}.)

Lemma 1. Let S be a relatively open and convex subset of a normed linear space X, x and y distinct points in S, and ϕ ∈ R^S a Fréchet differentiable map. If F ∈ R^(0,1) is defined by F(λ) := ϕ(λx + (1 − λ)y), then F is differentiable, and

$$F'(\lambda) = D_{\varphi,\lambda x + (1-\lambda)y}(x - y), \quad 0 < \lambda < 1.$$

Proof of the Generalized Mean Value Theorem. Fix any distinct x, y ∈ S, and define F : [0, 1] → R by F(λ) := ϕ(λx + (1 − λ)y). Being the composition of two continuous functions, F is continuous. (Yes?) Moreover, by Lemma 1, F|(0,1) is differentiable, and F′(λ) = Dϕ,λx+(1−λ)y(x − y) for any 0 < λ < 1. So, applying the

Exercise 27. Let X be a preordered normed linear space. Let O be a nonempty open and convex subset of X, and ϕ ∈ R^O a Fréchet differentiable map. Show that if Dϕ,x is a positive linear functional for any x ∈ O, then ϕ is increasing (that is, ϕ(x) ≥ ϕ(y) for any x, y ∈ O with x − y ∈ X+).

Exercise 28. For any n ∈ N, let S be a nonempty compact and convex subset of Rⁿ such that int_Rⁿ(S) ≠ ∅. Prove: If ϕ ∈ C(S) is a Fréchet differentiable function such that ϕ(x) = 0 for all x ∈ bd_Rⁿ(S), then there is an x∗ ∈ int_Rⁿ(S) such that Dϕ,x∗ is the zero functional.

Warning. The Mean Value Theorem (indeed, Rolle’s Theorem) cannot be extended to the context of vector calculus without substantial modification. For instance, in the case of the map Φ : R → R² defined by Φ(t) := (sin t, cos t), we have Φ(0) = Φ(2π) but DΦ,x(t) ≠ (0, 0) for any 0 ≤ x ≤ 2π and any nonzero t ∈ R. (Check!)
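The “Check!” above can be carried out numerically. In the Python sketch below, DΦ,x(t) is taken to be t(cos x, −sin x), which is what componentwise differentiation of Φ gives; the grid of 1001 candidate points in [0, 2π] is an illustrative assumption. If a Mean Value Theorem held for Φ, some c would satisfy Φ(2π) − Φ(0) = DΦ,c(2π), but the left side has norm zero while the right side always has norm 2π.

```python
import math


def Phi(t):
    """The map Phi(t) := (sin t, cos t) from the Warning."""
    return (math.sin(t), math.cos(t))


def D_Phi(x, t):
    """Frechet derivative of Phi at x applied to t: t * (cos x, -sin x)."""
    return (t * math.cos(x), -t * math.sin(x))


def norm(v):
    """Euclidean norm on R^2."""
    return math.hypot(v[0], v[1])


a, b = 0.0, 2 * math.pi
increment = norm((Phi(b)[0] - Phi(a)[0], Phi(b)[1] - Phi(a)[1]))
grid = [a + (b - a) * k / 1000 for k in range(1001)]
smallest = min(norm(D_Phi(c, b - a)) for c in grid)
print(increment, smallest)
```

The first number is zero up to rounding, while the second stays at 2π, so no point of the grid can serve as the “mean value” point.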

The following generalization of the Second Mean Value Theorem is also worth noting. It will prove very handy in Section 3 when we look into the properties of differentiable concave functionals.

The Generalized Second Mean Value Theorem. Let S be a relatively open and convex subset of a normed linear space X, and ϕ ∈ R^S a continuously Fréchet differentiable map. If ϕ is twice Fréchet differentiable,19 then, for any distinct x, y ∈ S,

$$\varphi(x) - \varphi(y) = D_{\varphi,y}(x - y) + \tfrac{1}{2} D^2_{\varphi,z}(x - y, x - y) \quad\text{for some } z \in \operatorname{co}\{x, y\} \setminus \{x, y\}.$$

Exercise 29.H Prove the Generalized Second Mean Value Theorem.

Exercise 30. (A Taylor’s Formula with Remainder) Let O be a nonempty open and convex subset of a normed linear space X, and ϕ ∈ R^O a continuously Fréchet differentiable map which is also twice Fréchet differentiable. Show that, for each x ∈ O, there is a (remainder) function r ∈ R^X such that

$$\varphi(x + z) - \varphi(x) = D_{\varphi,x}(z) + \tfrac{1}{2} D^2_{\varphi,x}(z, z) + r(z) \ \text{ for any } z \in X \text{ with } x + z \in O, \quad\text{and}\quad \lim_{z \to 0} \frac{r(z)}{\|z\|^2} = 0.$$

Example 8. Let O be a nonempty open subset of R², and take any twice continuously differentiable map ϕ : O → R. One can show that ϕ is not only continuously Fréchet differentiable, but it is also twice Fréchet differentiable.20 Moreover, as we noted earlier, a “folk” theorem of multivariate calculus says that ∂12ϕ(ω) = ∂21ϕ(ω) for any ω ∈ O.21 Consequently, Example 6.[2] yields

$$D^2_{\varphi,z}(u, u) = \partial_{11}\varphi(z)u_1^2 + 2\partial_{12}\varphi(z)u_1u_2 + \partial_{22}\varphi(z)u_2^2$$

for any z ∈ O and u ∈ R². Combining this with the Generalized Second Mean Value Theorem and Example 1.[3], therefore, we obtain the following result of advanced calculus: For every distinct x, y ∈ O, there exists a z ∈ co{x, y}\{x, y} such that

$$\varphi(x) - \varphi(y) = \partial_1\varphi(y)(x_1 - y_1) + \partial_2\varphi(y)(x_2 - y_2) + E(z),$$

where

$$E(z) := \tfrac{1}{2}\left( \partial_{11}\varphi(z)(x_1 - y_1)^2 + 2\partial_{12}\varphi(z)(x_1 - y_1)(x_2 - y_2) + \partial_{22}\varphi(z)(x_2 - y_2)^2 \right).$$

19 It may seem like there is a redundancy in the hypotheses here, but in fact this is not the case. Twice Fréchet differentiability of ϕ does not, in general, imply its continuous Fréchet differentiability.

20 While this may look like quite a bit to swallow, the proof is hidden in Exercises 9 and 10.

21 You have surely seen this fact before. (A Hessian matrix is always symmetric, no?) Its proof, while a bit tedious, follows basically from the definitions (but note that the assumption of ∂12 (or ∂21) being continuous is essential).

2.2 The Mean Value Inequality

We noted above that the Generalized Mean Value Theorem need not apply to maps that are not real-valued. It turns out that this is not a major difficulty. Indeed, most of the results of one-variable calculus that can be deduced from the Mean Value Theorem can also be obtained by using the so-called Mean Value Inequality: If O is an open subset of R that contains the open interval (a, b), and f ∈ R^O is differentiable, then

$$|f(b) - f(a)| \le \sup\{|f'(t)| : t \in O\}\,(b - a).$$

It turns out that this result extends nicely to the present framework, and this extension is all one needs for most purposes.22

The Mean Value Inequality. Let X and Y be two normed linear spaces, and O a nonempty open subset of X. Let Φ ∈ Y^O be Fréchet differentiable. Then, for every x, y ∈ O with co{x, y} ⊆ O, there exists a real number K ≥ 0 such that

Lemma 2. If x, y and z are any points in a normed linear space with z ∈ co{x, y}, then

$$\|x - y\| = \|x - z\| + \|z - y\|.$$

Proof. By hypothesis, there exists a 0 ≤ λ ≤ 1 such that z = λx + (1 − λ)y. Then x − z = (1 − λ)(x − y) and z − y = λ(x − y). Thus

$$\|x - z\| + \|z - y\| = (1 - \lambda)\|x - y\| + \lambda\|x - y\| = \|x - y\|,$$

22 For simplicity, we state this result for maps defined on open sets, but it is straightforward to extend it to the case of maps defined on relatively open sets.

as we sought.

Proof of the Mean Value Inequality. Fix any x, y ∈ O with co{x, y} ⊆ O, and take any K ≥ 0 that satisfies (15).23 Towards deriving a contradiction, suppose that there exists an ε > 0 such that

$$\|\Phi(x) - \Phi(y)\|_Y > (K + \varepsilon)\|x - y\|.$$

Let x0 := x and y0 := y. Since, by subadditivity of ‖·‖_Y,

− y1

Proceeding this way inductively, we obtain two sequences (xm) and (ym) in co{x, y} and a vector z ∈ co{x, y} with the following properties: For all m ∈ N,

(i) z ∈ co{xm, ym},
(ii) lim xm = z = lim ym,
(iii) ‖Φ(xm) − Φ(ym)‖_Y > (K + ε)‖xm − ym‖.

(Verify this carefully.24)

Since Φ is Fréchet differentiable at z, there is a δ > 0 such that Nδ,X(z) ⊆ O and

23 Quiz. How am I so sure that such a real number K exists?

24 I am implicitly invoking Lemma 2 here along with Cantor’s Nested Interval Lemma. Please make sure I’m not overlooking anything.

Recall that a major corollary of the Mean Value Theorem is the fact that a real function whose derivative vanishes on an open interval must be constant on that interval. The Mean Value Inequality yields the following generalization of this fact.

Corollary 1. Let X and Y be two normed linear spaces, and O a nonempty open and convex subset of X. If Φ ∈ Y^O is Fréchet differentiable and DΦ,x is the zero operator for each x ∈ O, then there is a y ∈ Y such that Φ(x) = y for all x ∈ O.

Exercise 31 Prove Corollary 1.

Exercise 32.H Show that the term “convex” can be replaced with “connected” in Corollary 1.

Exercise 33.H Let X and Y be two normed linear spaces, O a nonempty open and convex subset of X, and x, y ∈ O. Show that if Φ ∈ Y^O is Fréchet differentiable and xo ∈ O, then

$$\|\Phi(x) - \Phi(y) - D_{\Phi,x^o}(x - y)\|_Y \le K\|x - y\|$$

for any K ≥ sup{‖DΦ,w − DΦ,xo‖∗ : w ∈ co{x, y}}.

You might recall that a concave function f defined on an open interval I possesses very nice differentiability properties. In particular, any such f is differentiable everywhere on I but countably many points. While our main goal is to study those concave functions that are defined on convex subsets of an arbitrary normed linear space, it may still be a good idea to warm up by sketching a quick proof of this elementary fact.

Example 9. Let I be an open interval and f ∈ R^I a concave map. It is easy to check that the right-derivative f′₊ of f is a well-defined and decreasing function on I. (Prove!) Thus if (xm) ∈ I^∞ is a decreasing sequence with xm ↘ x, we have lim f′₊(xm) ≤ f′₊(x). On the other hand, concavity implies that

$$f'_+(x_m) \ge \frac{f(y) - f(x_m)}{y - x_m} \quad\text{for all } y \in I \text{ with } y > x_m,$$

for each m, so, since f is continuous (Corollary A.2), we get

$$\lim_{m \to \infty} f'_+(x_m) \ge \frac{f(y) - f(x)}{y - x} \quad\text{for all } y \in I \text{ with } y > x.$$

In turn, this implies lim f′₊(xm) ≥ f′₊(x), so we find lim f′₊(xm) = f′₊(x). But the analogous reasoning would yield this equation if (xm) was an increasing sequence with xm ↗ x. (Yes?) Conclusion: f is differentiable at x iff f′₊ is continuous at x. But since f′₊ is a monotonic function, it can have at most countably many points of discontinuity (Exercise B.8). Therefore, f is differentiable everywhere on I but countably many points.

Exercise 34. Let I be an open interval and f ∈ R^I a concave map. Show that there exists a countable subset S of I such that f′ ∈ C(I\S).

Unfortunately, we are confronted with various difficulties in higher dimensions. For instance, the map ϕ : R² → R defined by ϕ(u, v) := −|u| is a concave function which is not differentiable on the uncountable set {0} × R. Worse still, a concave function on an infinite-dimensional normed linear space may not possess a Fréchet derivative anywhere! For example, ϕ : ℓ¹ → R defined by ϕ((xm)) := −‖(xm)‖₁ is not Fréchet differentiable at any point in its domain (Exercise 17). Nevertheless, the concave functions that arise in most economic applications are in fact continuously differentiable, and hence for most practical purposes, these observations are not too problematic.
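The failure of differentiability of ϕ(u, v) := −|u| at points where u = 0 shows up already in one-sided difference quotients. The Python sketch below computes these quotients at the point (0, 1) in the direction (1, 0); the base point and step sizes are illustrative assumptions. Since the right and left quotients settle at different values, no single linear functional can approximate ϕ there.

```python
def phi(u, v):
    """The concave map phi(u, v) := -|u| on R^2."""
    return -abs(u)


def partial_quotient(v, h):
    """Difference quotient of phi at (0, v) in the direction (1, 0) with step h."""
    return (phi(0.0 + h, v) - phi(0.0, v)) / h


right = [partial_quotient(1.0, 10.0 ** -k) for k in range(1, 6)]
left = [partial_quotient(1.0, -(10.0 ** -k)) for k in range(1, 6)]
print(right)
print(left)
```

The same computation goes through at (0, v) for any real v, which is why the set of bad points is an entire line.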

∗Remark 2. Just in case the comments above sounded to you overly dramatic, we mention here, without proof, two positive results about the Fréchet differentiability of concave maps.

(a) Take any n ∈ N, and let us agree to say that a set S in Rⁿ is null if, for all ε > 0, there exist countably many n-cubes such that (i) S is contained in the union of these cubes, and (ii) the sum of the side lengths of these cubes is at most ε. (Recall Section D.1.4.) One can show that if O is an open and convex subset of Rⁿ and ϕ is a concave map on O, then the set of all points x ∈ O at which ϕ fails to be Fréchet differentiable is null.

(b) Asplund’s Theorem. If X is a Banach space such that X∗ is separable, and ϕ is a continuous and concave real function defined on an open and convex subset O of X, then the set of all points at which ϕ is Fréchet differentiable is dense in O.25

Exercise 35.H Let O be a nonempty open and convex subset of a normed linear space X, and ϕ ∈ R^O. We say that an affine map ϑ : X → R is a support of ϕ at x ∈ O if ϑ|O ≥ ϕ and ϑ(x) = ϕ(x). Show that if ϕ is a concave map which is Fréchet differentiable at x ∈ O, then ϕ has a unique support at x.26

25 Asplund’s Theorem is in fact more general than this. It says that the set of all points at which ϕ is Fréchet differentiable is a countable intersection of open and dense subsets of O. The easiest proof of this result that I know is the one given by Preiss and Zajicek (1984), who in fact prove something even more general.

26 Warning. The converse of this statement is false in general, but it is true when X is a Euclidean space.

3.2 Fréchet Diﬀerentiable Concave Maps

Recall that there are various useful ways of characterizing the concavity of differentiable real maps defined on an open interval. It is more than likely that you are familiar with the fact that if I is an open interval and ϕ ∈ R^I is differentiable and concave, then the difference-quotient (ϕ(y) − ϕ(x))/(y − x) is at most ϕ′(x) if y > x, while it is at least ϕ′(x) if x > y. (Draw a picture.) Put more concisely,

$$\varphi(y) - \varphi(x) \le \varphi'(x)(y - x) \quad\text{for all } x, y \in I.$$

A useful consequence of this observation is that a differentiable ϕ ∈ R^I is concave iff its derivative is a decreasing function on I. We now extend these facts to the context of real maps defined on open and convex subsets of an arbitrary normed linear space.

Proposition 6. Let S be a relatively open and convex subset of a normed linear space, and ϕ ∈ R^S a Fréchet differentiable map. Then, ϕ is concave if, and only if,

$$\varphi(y) - \varphi(x) \le D_{\varphi,x}(y - x) \quad\text{for all } x, y \in S. \tag{16}$$

Proof. Suppose that ϕ is concave, and take any x, y ∈ S. Then,

$$\varphi(x + \lambda(y - x)) = \varphi((1 - \lambda)x + \lambda y) \ge (1 - \lambda)\varphi(x) + \lambda\varphi(y)$$

for all 0 ≤ λ ≤ 1. Thus

$$\frac{1}{\lambda}\big(\varphi(x + \lambda(y - x)) - \varphi(x) - D_{\varphi,x}(\lambda(y - x))\big) \ge \varphi(y) - \varphi(x) - D_{\varphi,x}(y - x)$$

for each 0 < λ < 1. We obtain (16) by letting λ → 0 in this expression, and using the definition of Dϕ,x.27

Conversely, suppose (16) is true. Take any x, y ∈ S and 0 < λ < 1. If we set z := λx + (1 − λ)y, then λ(x − z) + (1 − λ)(y − z) = 0, so (16) implies

$$\begin{aligned}
\varphi(z) &= \lambda\varphi(z) + (1 - \lambda)\varphi(z) + D_{\varphi,z}(\lambda(x - z) + (1 - \lambda)(y - z)) \\
&= \lambda(\varphi(z) + D_{\varphi,z}(x - z)) + (1 - \lambda)(\varphi(z) + D_{\varphi,z}(y - z)) \\
&\ge \lambda\varphi(x) + (1 - \lambda)\varphi(y),
\end{aligned}$$

Corollary 2. Let S be a relatively open and convex subset of a normed linear space X, and ϕ ∈ R^S a Fréchet differentiable map. Then, ϕ is concave if, and only if,

$$(D_{\varphi,y} - D_{\varphi,x})(y - x) \le 0 \quad\text{for all } x, y \in S. \tag{17}$$
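Conditions (16) and (17) are easy to test on a concrete concave map. The Python sketch below samples random pairs of points in R² and checks both inequalities for a concave quadratic. The quadratic, the sampling box, the tolerance and the seed are all illustrative assumptions, and passing such a check is of course only evidence, not a proof of concavity.

```python
import random


def phi(x):
    """Concave quadratic on R^2 (an illustrative choice)."""
    return -x[0] ** 2 - x[0] * x[1] - x[1] ** 2


def D_phi(x, t):
    """Frechet derivative of phi at x applied to t (gradient dot t)."""
    grad = (-2 * x[0] - x[1], -x[0] - 2 * x[1])
    return grad[0] * t[0] + grad[1] * t[1]


def check(num_pairs=1000, tol=1e-9, seed=0):
    """Return True if (16) and (17) hold, up to tol, on all sampled pairs."""
    rng = random.Random(seed)
    for _ in range(num_pairs):
        x = (rng.uniform(-5, 5), rng.uniform(-5, 5))
        y = (rng.uniform(-5, 5), rng.uniform(-5, 5))
        d = (y[0] - x[0], y[1] - x[1])
        ineq16 = phi(y) - phi(x) <= D_phi(x, d) + tol
        ineq17 = D_phi(y, d) - D_phi(x, d) <= tol
        if not (ineq16 and ineq17):
            return False
    return True


all_hold = check()
print(all_hold)
```

Replacing phi by a non-concave map such as x[0] ** 2 makes the check fail, which is a quick way to convince oneself that the inequalities have bite.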

27 Notice that I used here the Fréchet differentiability of ϕ only at x. Thus: ϕ(y) − ϕ(x) ≤ Dϕ,x(y − x) holds for any y ∈ S and concave ϕ ∈ R^S, provided that Dϕ,x exists.

Exercise 36 Prove Corollary 2.

Exercise 37. Verify that in Proposition 6 and Corollary 2 we can replace the term “concave” with “strictly concave,” provided that we replace “≤” with “<” in (16) and (17).

Exercise 38.H Given any n ∈ N, let O be a nonempty open and convex subset of Rⁿ, and assume that ϕ ∈ R^O is a map such that ∂iϕ(x) exists for each x ∈ O and i = 1, ..., n. Show that if ϕ is concave, then

$$\varphi(y) - \varphi(x) \le \sum_{i=1}^{n} \partial_i\varphi(x)(y_i - x_i) \quad\text{for all } x, y \in O. \tag{18}$$

Conversely, if (18) holds, and each ∂iϕ(·) is a continuous function on O, then ϕ is concave.

If X = R in Corollary 2, then Dϕ,z(t) = ϕ′(z)t for all t ∈ R and z ∈ S, so (17) reduces to (ϕ′(y) − ϕ′(x))(y − x) ≤ 0 for all x, y ∈ S. So, a very special case of Corollary 2 is the well-known fact that a differentiable real map on an open interval is concave iff its derivative is decreasing on that interval. But a differentiable real function on a given open interval is decreasing iff its derivative is not strictly positive anywhere on that interval. It follows that a twice differentiable real map on an open interval is concave iff the second derivative of that map is less than or equal to zero everywhere. The following result generalizes this observation.

Corollary 3. Let S be a relatively open and convex subset of a normed linear space X, and ϕ ∈ R^S a continuously Fréchet differentiable map which is, also, twice Fréchet differentiable. Then, ϕ is concave if, and only if,

$$D^2_{\varphi,x}(z, z) \le 0 \quad\text{for all } (x, z) \in S \times X. \tag{19}$$

Proof. If (19) is true, then, by the Generalized Second Mean Value Theorem, we have ϕ(x) − ϕ(y) ≤ Dϕ,y(x − y) for any x, y ∈ S, so the claim follows from Proposition 6. Conversely, suppose ϕ is concave, and take any x ∈ S and z ∈ span(S − x). Observe that x + λz ∈ x + span(S − x) = aff(S), for any λ ∈ R. So, since S is relatively open and x ∈ S, we can choose a δ > 0 small enough so that (i) x + λz ∈ S for all −δ < λ < δ; and (ii) the map f ∈ R^(−δ,δ) defined by f(λ) := ϕ(x + λz) is concave, so that f″(0) ≤ 0. (Right?) But we have f″(λ) = D²ϕ,x+λz(z, z) for any −δ < λ < δ. (Yes?) Hence, f″(0) ≤ 0 implies D²ϕ,x(z, z) ≤ 0.

Exercise 39. Show that, under the conditions of Corollary 3, if D²ϕ,x(z, z) < 0 for all (x, z) ∈ S × X, then ϕ is strictly concave, but not conversely.
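The reduction used in the proof of Corollary 3, namely passing to the one-variable map f(λ) := ϕ(x + λz) and reading D²ϕ,x(z, z) as f″(0), can be mirrored numerically. The Python sketch below does this for ϕ(x) = log x1 + log x2 on the positive quadrant of R²; this map, the point x, the direction z and the step size h are illustrative assumptions.

```python
import math


def phi(x):
    """Concave sample map on the positive quadrant of R^2 (illustrative)."""
    return math.log(x[0]) + math.log(x[1])


def second_derivative_form(x, z):
    """D^2 phi at x applied to (z, z), from the Hessian diag(-1/x1^2, -1/x2^2)."""
    return -(z[0] / x[0]) ** 2 - (z[1] / x[1]) ** 2


def f_second_at_zero(x, z, h=1e-4):
    """Central second difference of f(lam) = phi(x + lam * z) at lam = 0."""
    def f(lam):
        return phi([x[0] + lam * z[0], x[1] + lam * z[1]])
    return (f(h) - 2 * f(0.0) + f(-h)) / h ** 2


x, z = [2.0, 0.5], [1.0, -0.25]
exact = second_derivative_form(x, z)
approx = f_second_at_zero(x, z)
print(exact, approx)
```

Both numbers are negative and nearly equal, consistent with (19) and with the identity f″(0) = D²ϕ,x(z, z) invoked in the proof.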

You of course know that the derivative of a differentiable real map need not be continuous. (That is, a differentiable real function need not be continuously differentiable.) One of the remarkable properties of concave (and hence convex) real
