Introduction to Probability, Part 7


[Figure 7.4: Chi-squared density with 5 degrees of freedom.]

[Figure 7.5: Rolling a fair die (1000 experiments, 60 rolls per experiment).]

[Figure 7.6: Convolution of n uniform densities, for n = 2, 4, 6, 8, 10.]

Independent Trials

We now consider briefly the distribution of the sum of n independent random variables, all having the same density function. If X_1, X_2, ..., X_n are these random variables and S_n = X_1 + X_2 + ··· + X_n is their sum, then we will have

    f_{S_n}(x) = (f_{X_1} ∗ f_{X_2} ∗ ··· ∗ f_{X_n})(x),

where the right-hand side is an n-fold convolution. It is possible to calculate this density for general values of n in certain simple cases.

Example 7.9 Suppose the X_i are uniformly distributed on the interval [0, 1]. Then

    f_{X_i}(x) = 1 if 0 ≤ x ≤ 1, and 0 otherwise,

and f_{S_n}(x) is given by the formula⁴

    f_{S_n}(x) = (1/(n − 1)!) Σ_{0 ≤ j ≤ x} (−1)^j C(n, j) (x − j)^{n−1} if 0 < x < n, and 0 otherwise.

The density f_{S_n}(x) for n = 2, 4, 6, 8, 10 is shown in Figure 7.6.

If the X_i are distributed normally, with mean 0 and variance 1, then (cf. Example 7.5)

    f_{X_i}(x) = (1/√(2π)) e^{−x²/2},

and

    f_{S_n}(x) = (1/√(2πn)) e^{−x²/2n}.

Here the density f_{S_n} for n = 5, 10, 15, 20, 25 is shown in Figure 7.7.

[Figure 7.7: Convolution of n standard normal densities, for n = 5, 10, 15, 20, 25.]

If the X_i are all exponentially distributed, with mean 1/λ, then

    f_{X_i}(x) = λe^{−λx},

and

    f_{S_n}(x) = λe^{−λx}(λx)^{n−1}/(n − 1)!.

In this case the density f_{S_n} for n = 2, 4, 6, 8, 10 is shown in Figure 7.8. ✷

[Figure 7.8: Convolution of n exponential densities with λ = 1, for n = 2, 4, 6, 8, 10.]

⁴ J. B. Uspensky, Introduction to Mathematical Probability (New York: McGraw-Hill, 1937), p. 277.
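The closed form in Example 7.9 is easy to evaluate and to check by simulation. The following Python sketch (ours, not part of the text; the function names are hypothetical) implements the formula and compares it, at one point, with a local frequency estimated from simulated sums:

```python
import math
import random

def uniform_sum_density(x, n):
    """Density of S_n = X_1 + ... + X_n, X_i ~ Uniform[0, 1],
    using the formula of Example 7.9 (sum over 0 <= j <= x)."""
    if not 0 < x < n:
        return 0.0
    total = sum((-1) ** j * math.comb(n, j) * (x - j) ** (n - 1)
                for j in range(int(x) + 1))
    return total / math.factorial(n - 1)

# The density should integrate to 1 over (0, n); check with a midpoint rule.
n, steps = 6, 10_000
dx = n / steps
mass = sum(uniform_sum_density((k + 0.5) * dx, n) for k in range(steps)) * dx
print(f"total mass for n = {n}: {mass:.6f}")   # should be close to 1

# Compare the formula at x = n/2 with a simulated local frequency.
trials, h = 200_000, 0.05
hits = sum(abs(sum(random.random() for _ in range(n)) - n / 2) < h
           for _ in range(trials))
print("simulated density at n/2:", hits / (trials * 2 * h))
print("formula   density at n/2:", uniform_sum_density(n / 2, n))
```

For n = 2 the formula reproduces the familiar triangular density on [0, 2]; note that math.comb requires Python 3.8 or later.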
Exercises

1 Let X and Y be independent real-valued random variables with density functions f_X(x) and f_Y(y), respectively. Show that the density function of the sum X + Y is the convolution of the functions f_X(x) and f_Y(y). Hint: Let X̄ be the joint random variable (X, Y). Then the joint density function of X̄ is f_X(x)f_Y(y), since X and Y are independent. Now compute the probability that X + Y ≤ z, by integrating the joint density function over the appropriate region in the plane. This gives the cumulative distribution function of Z. Now differentiate this function with respect to z to obtain the density function of Z.

2 Let X and Y be independent random variables defined on the space Ω, with density functions f_X and f_Y, respectively. Suppose that Z = X + Y. Find the density f_Z of Z if
(a) f_X(x) = f_Y(x) = 1/2 if −1 ≤ x ≤ +1, and 0 otherwise.
(b) f_X(x) = f_Y(x) = 1/2 if 3 ≤ x ≤ 5, and 0 otherwise.
(c) f_X(x) = 1/2 if −1 ≤ x ≤ 1, and 0 otherwise; f_Y(x) = 1/2 if 3 ≤ x ≤ 5, and 0 otherwise.
(d) What can you say about the set E = {z : f_Z(z) > 0} in each case?

3 Suppose again that Z = X + Y. Find f_Z if
(a) f_X(x) = f_Y(x) = x/2 if 0 < x < 2, and 0 otherwise.
(b) f_X(x) = f_Y(x) = (1/2)(x − 3) if 3 < x < 5, and 0 otherwise.
(c) f_X(x) = 1/2 if 0 < x < 2, and 0 otherwise; f_Y(x) = x/2 if 0 < x < 2, and 0 otherwise.
(d) What can you say about the set E = {z : f_Z(z) > 0} in each case?

4 Let X, Y, and Z be independent random variables with f_X(x) = f_Y(x) = f_Z(x) = 1 if 0 < x < 1, and 0 otherwise. Suppose that W = X + Y + Z. Find f_W directly, and compare your answer with that given by the formula in Example 7.9. Hint: See Example 7.3.

5 Suppose that X and Y are independent and Z = X + Y. Find f_Z if
(a) f_X(x) = λe^{−λx} if x > 0, and 0 otherwise; f_Y(x) = µe^{−µx} if x > 0, and 0 otherwise.
(b) f_X(x) = λe^{−λx} if x > 0, and 0 otherwise; f_Y(x) = 1 if 0 < x < 1, and 0 otherwise.

6 Suppose again that Z = X + Y. Find f_Z if

    f_X(x) = (1/(√(2π)σ_1)) e^{−(x−µ_1)²/2σ_1²},
    f_Y(x) = (1/(√(2π)σ_2)) e^{−(x−µ_2)²/2σ_2²}.

*7 Suppose that R² = X² + Y². Find f_{R²} and f_R if

    f_X(x) = (1/(√(2π)σ_1)) e^{−(x−µ_1)²/2σ_1²},
    f_Y(x) = (1/(√(2π)σ_2)) e^{−(x−µ_2)²/2σ_2²}.

8 Suppose that R² = X² + Y². Find f_{R²} and f_R if f_X(x) = f_Y(x) = 1/2 if −1 ≤ x ≤ 1, and 0 otherwise.

9 Assume that the service time for a customer at a bank is exponentially distributed with mean service time 2 minutes. Let X be the total service time for 10 customers. Estimate the probability that X > 22 minutes.

10 Let X_1, X_2, ..., X_n be n independent random variables each of which has an exponential density with mean µ. Let M be the minimum value of the X_j. Show that the density for M is exponential with mean µ/n. Hint: Use cumulative distribution functions.

11 A company buys 100 lightbulbs, each of which has an exponential lifetime of 1000 hours. What is the expected time for the first of these bulbs to burn out? (See Exercise 10.)

12 An insurance company assumes that the time between claims from each of its homeowners' policies is exponentially distributed with mean µ. It would like to estimate µ by averaging the times for a number of policies, but this is not very practical since the time between claims is about 30 years. At Galambos'⁵ suggestion the company puts its customers in groups of 50 and observes the time of the first claim within each group. Show that this provides a practical way to estimate the value of µ.

13 Particles are subject to collisions that cause them to split into two parts, with each part a fraction of the parent. Suppose that this fraction is uniformly distributed between 0 and 1. Following a single particle through several splittings we obtain a fraction of the original particle Z_n = X_1 · X_2 · ... · X_n, where each X_j is uniformly distributed between 0 and 1. Show that the density for the random variable Z_n is

    f_n(z) = (1/(n − 1)!) (−log z)^{n−1}.

Hint: Show that Y_k = −log X_k is exponentially distributed. Use this to find the density function for S_n = Y_1 + Y_2 + ··· + Y_n, and from this the cumulative distribution and density of Z_n = e^{−S_n}.

14 Assume that X_1 and X_2 are independent random variables, each having an exponential density with parameter λ. Show that Z = X_1 − X_2 has density f_Z(z) = (1/2)λe^{−λ|z|}.

15 Suppose we want to test a coin for fairness. We flip the coin n times and record the number of times X_0 that the coin turns up tails and the number of times X_1 = n − X_0 that the coin turns up heads. Now we set

    Z = Σ_{i=0}^{1} (X_i − n/2)² / (n/2).

Then for a fair coin Z has approximately a chi-squared distribution with 2 − 1 = 1 degree of freedom. Verify this by computer simulation, first for a fair coin (p = 1/2) and then for a biased coin (p = 1/3).

⁵ J. Galambos, Introductory Probability Theory (New York: Marcel Dekker, 1984), p. 159.
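A minimal simulation for Exercise 15 might look like the following sketch (ours, not from the text; 3.841 is the standard 5 percent critical value of the chi-squared distribution with one degree of freedom):

```python
import random

def coin_chi_square(n, p, trials=1000):
    """Simulate the statistic Z of Exercise 15 `trials` times for a coin
    with P(heads) = p tossed n times; return the fraction of runs in
    which Z exceeds 3.841 (5% critical value, 1 degree of freedom)."""
    exceed = 0
    for _ in range(trials):
        heads = sum(random.random() < p for _ in range(n))
        tails = n - heads
        z = (tails - n / 2) ** 2 / (n / 2) + (heads - n / 2) ** 2 / (n / 2)
        if z > 3.841:
            exceed += 1
    return exceed / trials

print("fair coin   (p = 1/2):", coin_chi_square(100, 1 / 2))  # roughly .05
print("biased coin (p = 1/3):", coin_chi_square(100, 1 / 3))  # most runs exceed the cutoff
```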
16 Verify your answers in Exercise 2(a) by computer simulation: Choose X and Y from [−1, 1] with uniform density and calculate Z = X + Y. Repeat this experiment 500 times, recording the outcomes in a bar graph on [−2, 2] with 40 bars. Does the density f_Z calculated in Exercise 2(a) describe the shape of your bar graph? Try this for Exercises 2(b) and 2(c), too.

17 Verify your answers to Exercise 3 by computer simulation.

18 Verify your answer to Exercise 4 by computer simulation.

19 The support of a function f(x) is defined to be the set {x : f(x) > 0}. Suppose that X and Y are two continuous random variables with density functions f_X(x) and f_Y(y), respectively, and suppose that the supports of these density functions are the intervals [a, b] and [c, d], respectively. Find the support of the density function of the random variable X + Y.

20 Let X_1, X_2, ..., X_n be a sequence of independent random variables, all having a common density function f_X with support [a, b] (see Exercise 19). Let S_n = X_1 + X_2 + ··· + X_n, with density function f_{S_n}. Show that the support of f_{S_n} is the interval [na, nb]. Hint: Write f_{S_n} = f_{S_{n−1}} ∗ f_X. Now use Exercise 19 to establish the desired result by induction.

21 Let X_1, X_2, ..., X_n be a sequence of independent random variables, all having a common density function f_X. Let A = S_n/n be their average. Find f_A if
(a) f_X(x) = (1/√(2π)) e^{−x²/2} (normal density).
(b) f_X(x) = e^{−x} (exponential density).
Hint: Write f_A(x) in terms of f_{S_n}(x).

Chapter 8: Law of Large Numbers

8.1 Law of Large Numbers for Discrete Random Variables

We are now in a position to prove our first fundamental theorem of probability. We have seen that an intuitive way to view the probability of a certain outcome is as the frequency with which that outcome occurs in the long run, when the experiment is repeated a large number of times. We have also defined probability mathematically as a value of a distribution function for the random variable representing the experiment. The Law of Large Numbers, which is a theorem proved about the mathematical model of probability, shows that this model is consistent with the frequency interpretation of probability. This theorem is sometimes called the law of averages. To find out what would happen if this law were not true, see the article by Robert M. Coates.¹

¹ R. M. Coates, "The Law," The World of Mathematics, ed. James R. Newman (New York: Simon and Schuster, 1956).

Chebyshev Inequality

To discuss the Law of Large Numbers, we first need an important inequality called the Chebyshev Inequality.

Theorem 8.1 (Chebyshev Inequality) Let X be a discrete random variable with expected value µ = E(X), and let ε > 0 be any positive real number. Then

    P(|X − µ| ≥ ε) ≤ V(X)/ε².

Proof. Let m(x) denote the distribution function of X. Then the probability that X differs from µ by at least ε is given by

    P(|X − µ| ≥ ε) = Σ_{|x−µ|≥ε} m(x).

We know that

    V(X) = Σ_x (x − µ)² m(x),

and this is clearly at least as large as

    Σ_{|x−µ|≥ε} (x − µ)² m(x),

since all the summands are positive and we have restricted the range of summation in the second sum. But this last sum is at least

    Σ_{|x−µ|≥ε} ε² m(x) = ε² Σ_{|x−µ|≥ε} m(x) = ε² P(|X − µ| ≥ ε).

So,

    P(|X − µ| ≥ ε) ≤ V(X)/ε². ✷

Note that X in the above theorem can be any discrete random variable, and ε any positive number.
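The bound is easy to compare with exact tail probabilities. The sketch below (our illustration, not from the text) does this for the number of heads in 100 tosses of a fair coin, where µ = 50 and V(X) = 25:

```python
import math

# Number of heads in 100 tosses of a fair coin: mean 50, variance 25.
n, p = 100, 0.5
mu, var = n * p, n * p * (1 - p)

def binom_pmf(k):
    """P(exactly k heads) under the binomial(n, p) distribution."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

for eps in (5, 10, 15):  # deviations of 1, 2, and 3 standard deviations
    exact = sum(binom_pmf(k) for k in range(n + 1) if abs(k - mu) >= eps)
    bound = var / eps ** 2
    print(f"eps = {eps:2d}:  exact tail = {exact:.5f}   Chebyshev bound = {bound:.5f}")
```

As the output shows, the Chebyshev bound is valid but typically far from tight; its value is that it holds for every distribution with finite variance.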
Example 8.1 Let X be any random variable with E(X) = µ and V(X) = σ². Then, if ε = kσ, Chebyshev's Inequality states that

    P(|X − µ| ≥ kσ) ≤ σ²/(k²σ²) = 1/k².

Thus, for any random variable, the probability of a deviation from the mean of more than k standard deviations is ≤ 1/k². If, for example, k = 5, then 1/k² = .04. ✷

Chebyshev's Inequality is the best possible inequality in the sense that, for any ε > 0, it is possible to give an example of a random variable for which Chebyshev's Inequality is in fact an equality. To see this, given ε > 0, choose X with distribution

    p_X = ( −ε    +ε  )
          ( 1/2   1/2 ).

Then E(X) = 0, V(X) = ε², and

    P(|X − µ| ≥ ε) = V(X)/ε² = 1.

We are now prepared to state and prove the Law of Large Numbers.

Law of Large Numbers

Theorem 8.2 (Law of Large Numbers) Let X_1, X_2, ..., X_n be an independent trials process, with finite expected value µ = E(X_j) and finite variance σ² = V(X_j). Let S_n = X_1 + X_2 + ··· + X_n. Then for any ε > 0,

    P(|S_n/n − µ| ≥ ε) → 0

as n → ∞. Equivalently,

    P(|S_n/n − µ| < ε) → 1

as n → ∞.

Proof. Since X_1, X_2, ..., X_n are independent and have the same distributions, we can apply Theorem 6.9. We obtain

    V(S_n) = nσ²,

and

    V(S_n/n) = σ²/n.

Also we know that

    E(S_n/n) = µ.

By Chebyshev's Inequality, for any ε > 0,

    P(|S_n/n − µ| ≥ ε) ≤ σ²/(nε²).

Thus, for fixed ε,

    P(|S_n/n − µ| ≥ ε) → 0

as n → ∞, or equivalently,

    P(|S_n/n − µ| < ε) → 1

as n → ∞. ✷

Law of Averages

Note that S_n/n is an average of the individual outcomes, and one often calls the Law of Large Numbers the "law of averages." It is a striking fact that we can start with a random experiment about which little can be predicted and, by taking averages, obtain an experiment in which the outcome can be predicted with a high degree of certainty. The Law of Large Numbers, as we have stated it, is often called the "Weak Law of Large Numbers" to distinguish it from the "Strong Law of Large Numbers" described in Exercise 15.
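The convergence is easy to watch numerically. Here is a small sketch (ours, not from the text) of the running average S_n/n for rolls of a fair die, whose mean is 3.5:

```python
import random

# Running average of rolls of a fair die; E(X_j) = 3.5.
total, rolls = 0, 0
for n in (10, 100, 1_000, 10_000, 100_000):
    while rolls < n:
        total += random.randint(1, 6)  # one fair die roll
        rolls += 1
    print(f"n = {n:6d}   S_n/n = {total / n:.4f}")  # drifts toward 3.5
```

Each run produces a different sequence of averages, but for large n they cluster ever more tightly around 3.5, exactly as the theorem predicts.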
[...]

... of shaded region:

    z    NA(z)      z    NA(z)      z    NA(z)      z    NA(z)
    0.0  .0000      1.0  .3413      2.0  .4772      3.0  .4987
    0.1  .0398      1.1  .3643      2.1  .4821      3.1  .4990
    0.2  .0793      1.2  .3849      2.2  .4861      3.2  .4993
    0.3  .1179      1.3  .4032      2.3  .4893      3.3  .4995
    0.4  .1554      1.4  .4192      2.4  .4918      3.4  .4997
    0.5  .1915      1.5  .4332      2.5  .4938      3.5  .4998
    0.6  .2257      1.6  .4452      2.6  .4953      3.6  .4998
    0.7  .2580      1.7  .4554      2.7  .4965      3.7  .4999
    0.8  .2881      1.8  .4641      2.8  .4974      3.8  .4999
    0.9  .3159      1.9  .4713      2.9  .4981      3.9  .5000

[Figure 9.4: ...]

... deviation for the number that accept is √(1700 · .6 · .4) ≈ 20. Thus we want to estimate the probability

    P(S_{1700} > 1060) = P(S_{1700} ≥ 1061)
                       = P(S*_{1700} ≥ (1060.5 − 1020)/20)
                       = P(S*_{1700} ≥ 2.025).

From Table 9.4, if we interpolate, we would estimate this probability to be .5 − .4784 = .0216. Thus, the college is fairly safe using this admission policy. ✷

Applications to Statistics

There are many important questions ...

[...] accidental and fortuitous occurrences we would be bound to recognize, as it were, a certain necessity and, so to speak, a certain fate. I do not know whether Plato wished to aim at this in his doctrine of the universal return of things, according to which he predicted that all things will return to their original state after countless ages have past.⁷

Exercises

1 A fair coin is tossed 100 times. The ...

[...]

    n      P(|S_n/n| ≥ .1)   Chebyshev
    100        .31731         1.00000
    200        .15730          .50000
    300        .08326          .33333
    400        .04550          .25000
    500        .02535          .20000
    600        .01431          .16667
    700        .00815          .14286
    800        .00468          .12500
    900        .00270          .11111
    1000       .00157          .10000

Table 8.1: Chebyshev estimates

Monte Carlo Method

Here is a somewhat more interesting example.

Example 8.7 Let g(x) be a continuous function defined for x ∈ [0, 1] with values in [0, 1]. In Section 2.1, we showed how to estimate ...

[...] To four decimal places, the actual value is .0485, and so the approximation is very good. ✷

The program CLTBernoulliLocal illustrates this approximation for any choice of n, p, and j. We have run this program for two examples. The first is the probability of exactly 50 heads in 100 tosses of a coin; the estimate is .0798, while the actual value, to four decimal places, is .0796. The second example is the probability ...

[...] .1093, while the actual value, to four decimal places, is .1196.

The individual binomial probabilities tend to 0 as n tends to infinity. In most applications we are not interested in the probability that a specific outcome occurs, but rather in the probability that the outcome lies in a given interval, say the interval [a, b]. In order to find this probability, we add the heights ...

[...] produces heads with probability 3/4. One of the two coins is picked at random, and this coin is tossed n times. Let S_n be the number of heads that turns up in these n tosses. Does the Law of Large Numbers allow us to predict the proportion of heads that will turn up in the long run? After we have observed a large number of tosses, can we tell which coin was chosen? How many tosses suffice to make us 95 percent ...

[...] would like to have 1050 freshmen. This college cannot accommodate more than 1060. Assume that each applicant accepts with probability .6 and that the acceptances can be modeled by Bernoulli trials. If the college accepts 1700, what is the probability that it will have too many acceptances? If it accepts 1700 students, the expected number of students who matriculate is .6 · 1700 = 1020 ...

[...] variable with values in [0, 100], mean 70, and variance 25.
(a) Find a lower bound for the probability that the student's score will fall between 65 and 75.
(b) If 100 students take the final, find a lower bound for the probability that the class average will fall between 65 and 75.

11 The Pilsdorff beer company runs a fleet of trucks along the 100 mile road from Hangtown to Dry Gulch, and maintains a garage ...

[...] about the probability that the number of heads that turn up deviates from the expected number 50 by three or more standard deviations (i.e., by at least 15)?

2 Write a program that uses the function binomial(n, p, x) to compute the exact probability that you estimated in Exercise 1. Compare the two results.

3 Write a program to toss a coin 10,000 times. Let S_n be the number of heads in the first n tosses ...
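Although Example 8.7 is cut off above, the Monte Carlo idea it introduces follows directly from the Law of Large Numbers: the average of g(X_1), ..., g(X_n) at independent uniform points converges to the integral of g over [0, 1]. A minimal sketch (ours; the function name and the example integrands are our own choices):

```python
import math
import random

def monte_carlo_integral(g, n=100_000):
    """Estimate the integral of g over [0, 1] as the average of g at n
    independent uniform points; by the Law of Large Numbers this average
    converges to E(g(X)) = integral of g over [0, 1]."""
    return sum(g(random.random()) for _ in range(n)) / n

print(monte_carlo_integral(lambda x: x * x))                  # ~1/3
print(monte_carlo_integral(lambda x: math.sin(math.pi * x)))  # ~2/pi = .6366
```

By Chebyshev's Inequality, the error of such an estimate shrinks at rate roughly 1/√n, which is why the sketch uses a fairly large n.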
