Exploratory Problems for Chapter 15

(a) Let $N = N(t)$ be the number of moles of substance A at time $t$. Translate the statement above into mathematical language. (Note: The number of moles of substance B should be expressed in terms of the number of moles of substance A.)

(b) $N(t)$ is a decreasing function. The rate at which $N$ is changing is a function of $N$, the number of moles of substance A. When the rate at which A is being converted to B is highest, how many moles are there of substance A?

PART V Adding Sophistication to Your Differentiation

CHAPTER 16 Taking the Derivative of Composite Functions

16.1 THE CHAIN RULE

We can construct conglomerate functions in two different ways. One way is to combine the functions’ outputs by taking, for example, their sum or product. We can differentiate a sum by summing the derivatives and differentiate a product by applying the Product Rule.

Another way to construct a conglomerate is to have functions operate in an assembly-line manner. In this configuration, the output of one function becomes the input of the next, creating a composite function. In this section we will look at the derivatives of composite functions. The work we do will have plentiful rewards, as the results have extensive application.

Our goal is to express the derivative of the composite function $f(g(x))$ in terms of $f$, $g$, and their derivatives. Our assumption throughout is that $f$ and $g$ are differentiable functions.

Many functions we can’t yet differentiate can be decomposed and expressed as the composite of simpler functions whose derivatives we know. For example:

$h(x) = \sqrt{\ln x}$ can be decomposed into $f(g(x))$, where $f(x) = \sqrt{x}$ and $g(x) = \ln x$.
$h(x) = (x^2 + 1)^{30}$ can be decomposed into $f(g(x))$, where $f(x) = x^{30}$ and $g(x) = x^2 + 1$.
$h(x) = e^{x^2}$ can be decomposed into $f(g(x))$, where $f(x) = e^x$ and $g(x) = x^2$.
$h(x) = \ln(x + 8x^2)$ can be decomposed into $f(g(x))$, where $f(x) = \ln x$ and $g(x) = x + 8x^2$.

We’ll look at the problem of differentiating a composite function in the context of the next example.

◆ EXAMPLE 16.1 The number of fish a lake can support varies with the water quality. The water quality is affected by industry around the lake; the level of grime in the lake varies with time. For the purposes of this example, we’ll assume that the level of grime in the lake is always increasing. The number of fish decreases as the level of grime goes up. Let $g(t)$ give the level of grime in the lake as a function of time $t$, measured in years. Let $f(g)$ be the fish population as a function of the amount of grime. Then the population of fish as a function of time is the composite function $f(g(t))$. Assume that $f$ and $g$ are differentiable functions. Find the rate at which the fish population is changing over time.

SOLUTION We must compute $\frac{d}{dt} f(g(t))$, or $\frac{df}{dt}$. Essentially, we argue as follows.

$$\frac{\Delta f}{\Delta t} = \frac{\Delta f}{\Delta g} \cdot \frac{\Delta g}{\Delta t}$$

From a purely algebraic standpoint, this must be true, provided $\Delta g$ and $\Delta t$ are not zero. Let $\Delta g = g(t + \Delta t) - g(t)$ and $\Delta f = f(g + \Delta g) - f(g)$. As $\Delta t \to 0$, we know $\Delta g \to 0$ because $g$ is a continuous function.

$$\begin{aligned}
\lim_{\Delta t \to 0} \frac{\Delta f}{\Delta t} &= \lim_{\Delta t \to 0} \left( \frac{\Delta f}{\Delta g} \cdot \frac{\Delta g}{\Delta t} \right) \\
\lim_{\Delta t \to 0} \frac{\Delta f}{\Delta t} &= \lim_{\Delta t \to 0} \frac{\Delta f}{\Delta g} \cdot \lim_{\Delta t \to 0} \frac{\Delta g}{\Delta t} \\
\lim_{\Delta t \to 0} \frac{\Delta f}{\Delta t} &= \lim_{\Delta g \to 0} \frac{\Delta f}{\Delta g} \cdot \lim_{\Delta t \to 0} \frac{\Delta g}{\Delta t} \\
\frac{df}{dt} &= \frac{df}{dg} \cdot \frac{dg}{dt}
\end{aligned}$$

For the purposes of this problem, since we’ve asserted that $g$ is increasing, we haven’t run into trouble. In terms of generalizing, we have trouble only if $\Delta g = 0$ infinitely many times as $\Delta t \to 0$. But if this is the case, it can be shown that both $\frac{dg}{dt} = 0$ and $\frac{df}{dt} = 0$, so the equation $\frac{df}{dt} = \frac{df}{dg} \cdot \frac{dg}{dt}$ still holds.

Let’s return to the fish and make things more concrete. Suppose we want to find the rate of change of the fish population with respect to time at $t = 3$, and suppose that at that time the grime level in the lake is 700 units.
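To see the relation $\frac{df}{dt} = \frac{df}{dg} \cdot \frac{dg}{dt}$ at work with actual numbers, here is a minimal Python sketch. The example gives no formulas for $f$ and $g$, so the two functions below are invented purely for illustration: they are chosen only so that grime increases with time, the fish population decreases with grime, and $g(3) = 700$. The sketch compares difference quotients $\Delta f / \Delta t$ over shrinking intervals $[3, 3 + \Delta t]$ with the product $f'(g(3)) \cdot g'(3)$.

```python
import math

# Hypothetical grime level (in units) after t years.
# Assumption for illustration: increasing, with g(3) = 700.
def g(t):
    return 400.0 + 100.0 * t

# Hypothetical fish population as a function of the grime level.
# Assumption for illustration: decreasing as grime goes up.
def f(grime):
    return 5000.0 * math.exp(-grime / 1000.0)

# Derivatives of the two hypothetical functions, computed by hand.
def g_prime(t):
    return 100.0

def f_prime(grime):
    return -5.0 * math.exp(-grime / 1000.0)

def difference_quotient(t, dt):
    """Average rate of change of the fish population over [t, t + dt]."""
    return (f(g(t + dt)) - f(g(t))) / dt

def chain_rule_rate(t):
    """(df/dg evaluated at g(t)) times (dg/dt evaluated at t)."""
    return f_prime(g(t)) * g_prime(t)

if __name__ == "__main__":
    for dt in (0.1, 0.01, 0.001):
        print("dt =", dt, "quotient =", difference_quotient(3.0, dt))
    print("df/dg * dg/dt at t = 3:", chain_rule_rate(3.0))
```

As the interval shrinks, the printed difference quotients approach the product of the two derivatives. Note that `f_prime` is evaluated at the grime level `g(3.0)`, not at the time 3.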
Let’s look at $\frac{\Delta f}{\Delta t}$, the rate of change of the number of fish over the time interval $[3, 3 + \Delta t]$, where $\Delta t$ is very small. Over a given time interval, the level of grime varies, and this causes a fluctuation in the number of fish in the lake. We’re interested in finding out how a change in time, $\Delta t$, affects the fish population. The change in time affects the fish population indirectly via the change in the grime level of the lake. There is a chain reaction; time affects grime, and grime affects fish. Therefore, we first look at the change in the level of grime produced by a $\Delta t$ change in time. How does $g(t)$ change during this small interval? We know that $\frac{dg}{dt} \approx \frac{\Delta g}{\Delta t}$ for $\Delta t$ very small, so we can solve for $\Delta g$.

$$\Delta g \approx \frac{dg}{dt} \, \Delta t$$

Looking at the time interval $[3, 3 + \Delta t]$ gives us the following.

$$\begin{aligned}
\Delta g &= g(3 + \Delta t) - g(3) \\
\Delta g &\approx \left. \frac{dg}{dt} \right|_{t=3} \Delta t \approx g'(3)\,\Delta t
\end{aligned}$$

[Figure 16.1: graph of $g(t)$ against $t$, marking $t = 3$ and $t = 3 + \Delta t$, the values $g(3) = 700$ and $g(3 + \Delta t)$, and the increments $\Delta t$ and $\Delta g$.]

Now we need to determine how this change in the level of grime, $\Delta g$, will affect the fish population. (We’ve assumed in the problem that $g$ is always increasing, so $\Delta g \neq 0$.) At $t = 3$, the grime level is 700. As the grime level changes from 700 to $700 + \Delta g$, how does the fish population change? As above, we can write $\frac{df}{dg} \approx \frac{\Delta f}{\Delta g}$ for $\Delta g$ very small. Because $\Delta t$ is very small (and will approach zero), $\Delta g$ is very small as well.$^1$ Solving for $\Delta f$ gives

$$\Delta f \approx \frac{df}{dg} \, \Delta g.$$

Focusing on what is happening at $t = 3$ (when $g = 700$) leads to this equation.

$$\begin{aligned}
\Delta f &= f(g(3 + \Delta t)) - f(g(3)) = f(700 + \Delta g) - f(700) \\
\Delta f &\approx \left. \frac{df}{dg} \right|_{g=700} \cdot \Delta g \approx f'(700)\,\Delta g
\end{aligned}$$

$^1$ This is because $g$ is a continuous function. We know $g$ is continuous because we are working under the assumption that $g$ is differentiable.

[Figure 16.2: graph of $f(g)$ against $g$, marking $g = 700$ and $g = 700 + \Delta g$, the values $f(700)$ and $f(700 + \Delta g)$, and the change $|\Delta f|$; shown alongside the graph of $g(t)$ against $t$ with $t = 3$, $t = 3 + \Delta t$, and the increments $\Delta t$ and $\Delta g$.]

We can use the expression for $\Delta g$ found earlier.
$$\Delta f \approx f'(700)\,\Delta g \approx f'(700)\,g'(3)\,\Delta t \approx f'(g(3)) \cdot g'(3)\,\Delta t$$

Now we can find an expression for $\frac{\Delta f}{\Delta t}$, the rate of change of the population with respect to time.

$$\frac{\Delta f}{\Delta t} \approx \frac{f'(g(3))\,g'(3)\,\Delta t}{\Delta t} \approx f'(g(3))\,g'(3)$$

Let’s look at the approximations we made in this discussion. As $\Delta t$ gets closer and closer to zero, the approximation $\frac{dg}{dt} \approx \frac{\Delta g}{\Delta t}$ gets better and better, because $\lim_{\Delta t \to 0} \frac{\Delta g}{\Delta t} = \frac{dg}{dt}$. As $\Delta t$ approaches zero, $\Delta g = g(t + \Delta t) - g(t)$ also approaches zero. As $\Delta g$ gets closer and closer to zero, the approximation $\frac{df}{dg} \approx \frac{\Delta f}{\Delta g}$ gets better and better, because $\lim_{\Delta g \to 0} \frac{\Delta f}{\Delta g} = \frac{df}{dg}$. This leads us to conclude that

$$\left. \frac{d}{dt} f(g(t)) \right|_{t=3} = f'(g(3)) \cdot g'(3).$$

In fact, there was nothing special about the time $t = 3$, so this relation should hold for all values of $t$:

$$[f(g(t))]' = f'(g(t)) \cdot g'(t).$$

A unit analysis gives

$$\frac{\text{fish}}{\text{time}} = \frac{\text{fish}}{\text{grime}} \cdot \frac{\text{grime}}{\text{time}},$$

which makes sense. The crucial characteristic of $f$ and $g$ is that they are differentiable. In our discussion, we need the assumption that $\Delta g \neq 0$, but, as previously mentioned, there are ways of getting around this restriction. ◆

The result can be stated for any two differentiable functions $f$ and $g$; it goes by the name “the Chain Rule.”

The Chain Rule:

$$\frac{d}{dt} f(g(t)) = f'(g(t)) \cdot g'(t) \quad \text{or} \quad \frac{df}{dt} = \frac{df}{dg} \cdot \frac{dg}{dt}$$

Interpreting the Chain Rule

Thoughts on the form $\frac{d}{dt} f(g(t)) = f'(g(t)) \cdot g'(t)$:

More informally, we can state the Chain Rule as

$$\frac{d}{dt} \bigl( f(\text{mess}) \bigr) = f'(\text{mess}) \cdot (\text{mess})',$$

where mess, of course, is just a function of $t$; it’s $g(t)$.

Notice that the derivative of $f$ is evaluated at $g(t)$; $f'(g(t))$, not $f'(t)$. We obtain $f'(g(t))$ by calculating $f'(x)$ and replacing $x$ by $g(t)$. Think about our example; $f$ is a function of grime, so $f'$ is also a function of grime, not of time. We evaluated $f'$ at $g = 700$ (or $g(3)$), not at $g = 3$.

$$[f(g(t))]' = f'(g(t)) \cdot g'(t) = (\text{derivative of } f \text{ evaluated at } g(t)) \cdot (\text{derivative of } g)$$

Thoughts on the form $\frac{df}{dt} = \frac{df}{dg} \cdot \frac{dg}{dt}$:

Leibniz’s notation gives us a very nice way of expressing the Chain Rule.
Although $\frac{df}{dg}$ and $\frac{dg}{dt}$ are not fractions, the notation works well; that is the genius of it. Let’s take a few more quick passes on interpreting the Chain Rule in this form.

If $g$ changes three times as fast as $t$ and $f$ changes twice as fast as $g$, then $f$ changes 6 times as fast as $t$.

$$\frac{df}{dt} = \frac{df}{dg} \cdot \frac{dg}{dt}$$

Or, think of gears, either interlocking or connected by a chain as in bicycle gears. Suppose the little gear spins 12 times per minute and the big gear spins once for every three turns of the little gear. Then the big gear spins 4 times per minute.

$$\frac{\text{rotations of big gear}}{\text{rotations of little gear}} \cdot \frac{\text{rotations of little gear}}{\text{minute}} = \frac{\text{rotations of big gear}}{\text{minute}}$$

Applying the Chain Rule

Let’s apply the Chain Rule to the functions $h(x) = f(x + k)$ and $j(x) = f(kx)$, corresponding to a horizontal shift of $f$ and a horizontal compression of $f$, respectively. If you haven’t previously done so, first spend a minute trying to determine $h'(x)$ and $j'(x)$ by graphical means.

◆ EXAMPLE 16.2 $f(x + k)$ can be thought of as the composite $f(g(x))$, where $g(x) = x + k$.

$$[f(x + k)]' = f'(g(x)) \cdot g'(x) = f'(x + k) \cdot 1 = f'(x + k)$$

This is in agreement with our graphical intuition. ◆

◆ EXAMPLE 16.3 $f(kx)$ can be thought of as the composite $f(g(x))$, where $g(x) = kx$.

$$[f(kx)]' = f'(g(x)) \cdot g'(x) = f'(kx) \cdot k = k f'(kx)$$

This makes sense graphically. Suppose $k > 0$. Because the function’s graph is compressed horizontally, its derivative will be as well. This horizontal compression results in steeper slopes; hence $f'(kx)$ is multiplied by $k$. ◆

Some functions that we could differentiate by taking advantage of laws of logs and exponentials we can now differentiate using the Chain Rule. The next example illustrates these options.

◆ EXAMPLE 16.4 Find the derivatives of the following, where $k$ is a constant.
(a) $q(x) = e^{kx+2}$
(b) $s(x) = \ln(kx)$

SOLUTION
(a) Using Exponent Laws: $q(x) = e^{kx} e^2$, so $q'(x) = e^2 k e^{kx} = k \cdot e^{kx+2}$.
Using the Chain Rule: Let the inside function, $g(x)$, be $kx + 2$ and the outside function, $f(u)$, be $e^u$. Then $g'(x) = k$, $f'(u) = e^u$, and $f'(g(x)) = e^{g(x)} = e^{kx+2}$. So $q'(x) = f'(g(x)) \cdot g'(x) = e^{kx+2} \cdot k$.

(b) Using Log Laws: $s(x) = \ln k + \ln x$. $\ln k$ is a constant, so $s'(x) = \frac{1}{x}$.
Using the Chain Rule: Let the inside function, $g(x)$, be $kx$ and the outside function, $f(u)$, be $\ln u$. Then $g'(x) = k$, $f'(u) = \frac{1}{u}$, and $f'(g(x)) = \frac{1}{g(x)} = \frac{1}{kx}$. So $s'(x) = f'(g(x)) \cdot g'(x) = \frac{1}{kx} \cdot k = \frac{1}{x}$. ◆

Observe that, by using the Chain Rule, we can generalize the three basic derivative rules.

$\frac{d}{dx}[x^n] = n x^{n-1}$ can be generalized to $\frac{d}{dx}[g(x)]^n = n[g(x)]^{n-1} \cdot \frac{dg}{dx}$
$\frac{d}{dx}[b^x] = \ln b \cdot b^x$ can be generalized to $\frac{d}{dx} b^{g(x)} = \ln b \cdot b^{g(x)} \cdot \frac{dg}{dx}$
$\frac{d}{dx} \log_b x = \frac{1}{\ln b} \cdot \frac{1}{x}$ can be generalized to $\frac{d}{dx} \log_b g(x) = \frac{1}{\ln b} \cdot \frac{1}{g(x)} \cdot \frac{dg}{dx}$

◆ EXAMPLE 16.5 Decompose each of the following functions into $f(g(x))$ and then compute the derivative.
(a) $j(x) = (x^6 + 5x^3 + x^2)^8$
(b) $k(x) = \ln(x^3 + 3^x)$
(c) $h(x) = 3^{x^2}$

SOLUTION
(a) $j(x) = (x^6 + 5x^3 + x^2)^8$. Multiplying out would be terribly tiring and tedious. Instead, use the Chain Rule, where the inside function, $g(x)$, is $x^6 + 5x^3 + x^2$ and the outside function, $f(u)$, is $u^8$. (Check that $f(g(x)) = j(x)$.)

$$j'(x) = f'(g(x)) \cdot g'(x)$$

We know $f'(u) = 8u^7$, so $f'(g(x)) = 8[g(x)]^7 = 8(x^6 + 5x^3 + x^2)^7$.
$g'(x) = 6x^5 + 15x^2 + 2x$.
So $j'(x) = 8(x^6 + 5x^3 + x^2)^7 \cdot (6x^5 + 15x^2 + 2x)$.
This is a lot easier than multiplying out in the beginning! Basically, this function is of the form $(\text{mess})^8$, so its derivative is $8(\text{mess})^7 \cdot (\text{mess})'$.

(b) $k(x) = \ln(x^3 + 3^x)$. Let the inside function, $g(x)$, be $x^3 + 3^x$ and the outside function be $f(u) = \ln u$. (Check that $f(g(x)) = k(x)$.)

$$k'(x) = f'(g(x)) \cdot g'(x)$$

We know $f'(u) = \frac{1}{u}$, so $f'(g(x)) = \frac{1}{g(x)} = \frac{1}{x^3 + 3^x}$.
$g'(x) = 3x^2 + (\ln 3)3^x$.
So $k'(x) = \frac{1}{x^3 + 3^x} \cdot (3x^2 + (\ln 3)3^x)$.
Basically, this function is of the form $\ln(\text{mess})$, so its derivative is $\frac{1}{\text{mess}} \cdot (\text{mess})'$.
(c) $h(x) = 3^{x^2}$. Let the inside function, $g(x)$, be $x^2$ and the outside function be $f(u) = 3^u$.

$$h'(x) = f'(g(x)) \cdot g'(x)$$

We know $f'(u) = (\ln 3)3^u$, so $f'(g(x)) = (\ln 3)3^{g(x)} = (\ln 3)3^{x^2}$.
$g'(x) = 2x$.
So $h'(x) = (\ln 3)3^{x^2}(2x) = (2 \ln 3)(x)(3^{x^2})$.
Basically, this function is of the form $3^{\text{mess}}$, so its derivative is $(\ln 3)3^{\text{mess}} \cdot (\text{mess})'$. ◆

PROBLEMS FOR SECTION 16.1

1. (a) Which of the following are equal to $(\ln x)^2$?
   i. $(\ln x)(\ln x)$ ii. $\ln x^2$ iii. $\ln[(x)(x)]$ iv. $2 \ln x$
   (b) Which of the following are equal to $2 \ln x$?
   i. $(\ln x)(\ln x)$ ii. $\ln x^2$ iii. $\ln[(x)(x)]$
   (c) Differentiate $y = (\ln x)^2$. (Do this twice, first using the product rule and then using the Chain Rule.)
   (d) Differentiate $y = \ln x^2$. (Do this twice, first using the log rules and the derivative of $\ln x$ and then using the Chain Rule.)

In Problems 2 through 20, find $f'(x)$. Do these problems without using the Quotient Rule.

2. (a) $f(x) = 3(x + 2)^{-5}$ (b) $f(x) = 2(3x + 7)^{-8}$
3. f(x)=ln √ πx + 1 + √ πx + (π x + π) 5 + 1 (πx 2 +1) 3 (Hint: Use log operations to simplify the first term.)
4. f(x)= x (x 3 +7x) 4
5. f(x)= e x 3x 2 +1
6. f(x)=e 5x (1+2x) 6
7. $f(x) = \left(1 - \frac{1}{x}\right) e^{-x}$
8. f(x)=ln( √ x 3 )e 6x
9. $f(x) = 5 \ln(2x^2 + 3x)$
10. $f(x) = (3x^3 + 2x)^{13}$
11. f(x)= e πx (x+x 2 ) 3
12. f(x)= π 2 3(x 3 +2) 6
13. f(x)= 1 x 3 +7x+5
14. f(x)= 3 x 2 x+1
15. f(x)=x5 x+1 2
16. f(x)= 4 √ e x +1
17. f(x)= e x + ln(x +1) 2
18. $f(x) = \frac{1}{\ln(x^2 + 2)}$
19. $f(x) = \ln(e^x + x^2)$
20. f(x)=x ln x x 2 +1
21. Find a formula for $\frac{dy}{dx}$ if $y = f(g(h(x)))$, where $f$, $g$, and $h$ are differentiable everywhere.

In Problems 22 through 25, graph $f(x)$, labeling the $x$-coordinates of all local extrema. Strategize. Is it more convenient to keep expressions factored?