Chapter Seven
Random Variables and Discrete Probability Distributions

Random Variable and Probability Distribution
• In the background, there is a random experiment. As we discussed, accompanying this experiment, we have:
  • A Sample Space (all possible outcomes of the experiment)
  • A probability (assigned to each outcome in the experiment)
• We now add the concept of a random variable and a probability distribution

• Random Variable: A random variable is a function that assigns a number to each outcome of the experiment.
• There are two main types of random variables:
  • Discrete Random Variables
  • Continuous Random Variables
  The distinction basically depends on the range of values that the random variable can take
• A discrete random variable is one that can take on a countable number of values.
• A continuous random variable is one that can take on an uncountable number of values

• Example: The experiment is flipping a coin 10 times, and let X = # of heads observed in the experiment. This is a discrete random variable, since X can only take on the values {0,1,2,…,10}, which is finite and therefore countable
• Example: Suppose the experiment is measuring the time to complete a task, and let X = total time taken. This is a continuous random variable: since time is continuous, the range of values that X can take is a continuum, and therefore uncountable.

• To help understand the distinction between countable and uncountable sets, keep in mind:
  a) The set of all integer numbers is countable
  b) The set of all real numbers is uncountable (a continuum)

Probability Distribution
• A probability distribution describes the values that a random variable can take, along with the probability associated with each value.
• A probability distribution can be summarized by a table, a formula or a graph.
• In Chapter 7 we focus on the probability distribution of a discrete random variable

• If X is a discrete random variable, its probability distribution simply represents the probability that X takes on each one of its possible values
• We use upper case to denote a random variable, and we use lower case to denote a particular value that this random variable can take.
• We represent the probability that the random variable ‘X’ will equal ‘x’ as P(X=x) or, more simply, P(x).

Requirements of a Discrete Probability Distribution
• As a result of the conditions required of a probability (non-negative, they must add to 1), the probability distribution P of a discrete random variable must satisfy:
  $0 \le P(x) \le 1$ for all $x$
  $\sum_{\text{all } x} P(x) = 1$
• where the notation $\sum_{\text{all } x} P(x)$ denotes the sum of P(x) over all possible values ‘x’ that the random variable X can take

• Example 7.1: The Statistical Abstract of the United States is published annually. It contains a wide variety of information based on the census as well as other sources.
• Its goal is to provide information about a variety of different aspects of the lives of the country’s residents.
• One of the questions asks households to report the number of persons living in the household. The following table summarizes the data.
• Develop the probability distribution of the random variable defined as the number of persons per household

Number of Persons    Number of Households (millions)
1                    31.1
2                    38.6
3                    18.8
4                    16.2
5                    7.2
6                    2.7
7 or more            1.4
Total                116.0

• Each probability is the relative frequency of the corresponding value, for example P(1) = 31.1/116.0 ≈ .268

• Difference between Binomial and Poisson random variables:
• A binomial random variable is the number of successes in a given number of trials, whereas a Poisson random variable is the number of successes in an interval of time or in a specific region of space

The Poisson Experiment
• Like a binomial experiment, a Poisson experiment has four defining properties:
  i. The number of successes that occur in any interval is independent of the number of successes that occur in any other interval
  ii. The probability of a success in an interval is the same for all equal-size intervals
  iii. The probability of a success is proportional to the size of the interval.
  iv. The probability of more than one success in an interval approaches 0 as the interval becomes smaller

• The Poisson random variable is the number of successes that occur in a period of time or an interval of space in a Poisson experiment
• E.g. On average, 96 trucks (successes) arrive at a border crossing every hour (time period)
• E.g. The number of typographic errors (successes ?!) in a new textbook edition averages 1.5 per 100 pages (interval)

Poisson Probability Distribution
• The probability that a Poisson random variable assumes a value of x is given by:
  $P(x) = \dfrac{e^{-\mu}\mu^{x}}{x!}$ for $x = 0, 1, 2, \ldots$
  where µ is the mean number of successes in the interval and e is the natural logarithm base
• The expected value and variance of a Poisson random variable X are given by:
  $E[X] = \mu$ and $V(X) = \mu$

• Example 7.12: A statistics instructor has observed that the number of typographical errors in new editions of textbooks varies considerably from book to book. After some analysis he concludes that the number of errors is Poisson distributed with a mean of 1.5 per 100 pages.
• The instructor randomly selects 100 pages of a new book. What is the probability that there are no typos?
• That is, what is P(X=0) given that µ = 1.5?
  $P(0) = \dfrac{e^{-1.5}(1.5)^{0}}{0!} = .2231$
  “There is about a 22% chance of finding zero errors”
• Suppose that the instructor has just received a copy of a new statistics book. He notices that there are 400 pages
  a) What is the probability that there are no typos?
  b) What is the probability that there are five or fewer typos?

• How to proceed?
• First, note that we are now talking about an interval of 400 pages.
• In the original statement of the problem, we were told that the expected number of typos in an interval of 100 pages was µ = 1.5
• Therefore, the expected number of typos in an interval of 400 pages is 4*1.5 = 6
• Thus, when we deal with an interval of 400 pages, we must use µ = 6 in the Poisson distribution formula

• For a 400 page book, what is the probability that there are no typos?
  $P(0) = \dfrac{e^{-6}(6)^{0}}{0!} = .002479$
  “there is a very small chance there are no typos”

• For a 400 page book, what is the probability that there are five or fewer typos?
  P(X≤5) = P(0) + P(1) + … + P(5)
• This is rather tedious to solve manually. A better alternative is to refer to Table 2 in Appendix B…
  k=5, µ=6, and P(X ≤ k) = .446
  “there is about a 45% chance there are five or fewer typos”

• Characterize a range of values that will include the actual number of typos found in a 400 page book with probability at least 90%
• Again, we can use Chebysheff’s Theorem. Since we want this to be at least 90%, we first need to find the ‘k’ such that
  $1 - \dfrac{1}{k^{2}} = 0.90$
• This yields $k = \sqrt{10} = 3.162$
• Next, recall that, if X is a Poisson random variable, then E[X]=µ and V(X)=µ (and therefore $\sigma = \sqrt{\mu}$).
• If X is the number of typos in 400 pages, then $\mu = 6$ and $\sigma = \sqrt{6} = 2.449$

• Thus, the interval is: [6 - 3.162*2.449 , 6 + 3.162*2.449] = [-1.743 , 13.743]
• We cannot have a negative number of typos, so we can truncate the interval at zero
• Thus, we can state that “with probability at least 90%, there will be between 0 and 13 typos in a 400-page book”.
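
The numbers in Example 7.12 can be checked with a short Python sketch. It is only an illustration: the helper names `poisson_pmf` and `poisson_cdf` are assumptions made here (they are not part of the slides), and the cumulative probability is obtained by direct summation rather than from Table 2.

```python
import math


def poisson_pmf(x: int, mu: float) -> float:
    """P(X = x) for a Poisson random variable with mean mu."""
    return math.exp(-mu) * mu**x / math.factorial(x)


def poisson_cdf(k: int, mu: float) -> float:
    """P(X <= k), obtained by summing P(0) + P(1) + ... + P(k)."""
    return sum(poisson_pmf(x, mu) for x in range(k + 1))


mu_100 = 1.5          # mean typos per 100 pages (Example 7.12)
mu_400 = 4 * mu_100   # mean typos per 400 pages

p_none_100 = poisson_pmf(0, mu_100)  # no typos in 100 pages
p_none_400 = poisson_pmf(0, mu_400)  # no typos in 400 pages
p_le5_400 = poisson_cdf(5, mu_400)   # five or fewer typos in 400 pages

# Chebysheff: solve 1 - 1/k^2 = 0.90 for k, then use mu +/- k*sigma.
k = math.sqrt(1 / (1 - 0.90))
sigma = math.sqrt(mu_400)             # for a Poisson variable, V(X) = mu
lower = max(0.0, mu_400 - k * sigma)  # a count cannot be negative
upper = mu_400 + k * sigma

print(p_none_100, p_none_400, p_le5_400)
print(lower, upper)
```

The printed probabilities should agree with the slide values (.2231, .002479 and .446) up to rounding, and the upper bound should land just below 14, consistent with "between 0 and 13 typos".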
Poisson Distribution in Excel
• Excel can compute Pr(X = x) and Pr(X ≤ x) when X is a Poisson random variable
• The command is: POISSON.DIST(x, mean, cumulative)
• Where:
  • x: The number of events (“successes”)
  • mean: The expected number of successes per interval
  • cumulative: If cumulative is TRUE, then POISSON.DIST returns the cumulative distribution function Pr(X ≤ x); if FALSE, it returns the probability function Pr(X = x)

• Example: Exercise 7.117: The number of bank robberies that occur in a large North American city is Poisson distributed with a mean of 1.5 per day. Find the probabilities of the following events:
  a) Three or more bank robberies occur in a day
  b) Between 10 and 15 robberies occur during a 5-day period.

• a) This is Pr(X ≥ 3). Again, to use Table 2, we need to express this in terms of a probability of the type Pr(X ≤ k). Note that:
  Pr(X ≥ 3) = 1 - Pr(X ≤ 2)
  where X is a Poisson random variable with µ=1.5
• b) This is Pr(10 ≤ X ≤ 15). We have:
  Pr(10 ≤ X ≤ 15) = Pr(X ≤ 15) - Pr(X ≤ 9)
  where X is Poisson with µ=1.5*5 = 7.5

• Using Table 2 in Appendix B, if µ=1.5, then Pr(X ≤ 2) = 0.8088 and therefore
  Pr(X ≥ 3) = 1 - Pr(X ≤ 2) = 0.1912
• And if µ=7.5, then Pr(X ≤ 15) = 0.9954 and Pr(X ≤ 9) = 0.7764 and therefore
  Pr(10 ≤ X ≤ 15) = Pr(X ≤ 15) - Pr(X ≤ 9) = 0.219
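
For readers without Excel or Table 2 at hand, the two answers to Exercise 7.117 can be reproduced with a small Python sketch. The function `poisson_dist` below is a hypothetical stand-in written for this illustration; it only mimics the argument order of Excel's POISSON.DIST(x, mean, cumulative) and is not an Excel or library API.

```python
import math


def poisson_dist(x: int, mean: float, cumulative: bool) -> float:
    """Hypothetical Python analogue of Excel's POISSON.DIST(x, mean, cumulative)."""

    def pmf(n: int) -> float:
        # Poisson probability function: e^(-mean) * mean^n / n!
        return math.exp(-mean) * mean**n / math.factorial(n)

    if cumulative:
        return sum(pmf(n) for n in range(x + 1))  # Pr(X <= x)
    return pmf(x)  # Pr(X = x)


mu_day = 1.5            # mean robberies per day (Exercise 7.117)
mu_5days = 5 * mu_day   # mean robberies over a 5-day period

# a) Pr(X >= 3) = 1 - Pr(X <= 2)
p_three_or_more = 1 - poisson_dist(2, mu_day, True)

# b) Pr(10 <= X <= 15) = Pr(X <= 15) - Pr(X <= 9)
p_10_to_15 = poisson_dist(15, mu_5days, True) - poisson_dist(9, mu_5days, True)

print(p_three_or_more, p_10_to_15)
```

Both results should match the table-based answers (about 0.191 and 0.219) up to rounding.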