Probability and Statistics Lecture


Page 1

Random Variables and Discrete Probability Distributions

Page 2

• A probability (assigned to each outcome in the experiment)

• We now add the concept of a random variable and a

probability distribution.

Page 3

• A discrete random variable is one that can take on a countable number of values.

• A continuous random variable is one that can take

on an uncountable number of values.

Page 4

• Example: The experiment is flipping a coin 10 times; let X = number of heads observed in the experiment. This is a discrete random variable, since X can only take on the values {0, 1, 2, …, 10}, a set that is finite and therefore countable.

• Example: Suppose the experiment is measuring the

time to complete a task, and let X=total time taken.

This is a continuous random variable: Since time is

continuous, the range of values that X can take is a continuum, and therefore uncountable.

Page 5

• To help understand the distinction between countable and uncountable sets, keep in mind:

a) The set of all integers is countable.

b) The set of all real numbers is uncountable (a continuum).

Page 6

• A probability distribution describes the values that a random variable can take, along with the probabilities associated with those values.

Page 7

• If X is a discrete random variable, its probability

distribution simply represents the probability that X

can take on each one of its possible values.

• We use upper case to denote a random variable,

and we use lower case to denote a particular value that this random variable can take.

• We represent the probability that the random

variable ‘X’ will equal ‘x’ as P(X=x) or, more simply,

P(x).

Page 8

• As a result of the conditions required of a probability (non‐negative, they must add to 1), the probability distribution P of a discrete random variable must satisfy:

0 ≤ P(x) ≤ 1 for every value x

Σ P(x) = 1, where the sum runs over all values x
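
The two conditions named in this bullet (each probability is non-negative and no larger than 1, and the probabilities add to 1) can be checked mechanically. The following is a minimal Python sketch; the function name `is_valid_distribution` and the sample distribution are illustrative assumptions, not part of the lecture.

```python
def is_valid_distribution(dist, tol=1e-9):
    """Return True if dist (a mapping value -> probability) satisfies
    0 <= P(x) <= 1 for every x and the probabilities sum to 1."""
    probs = list(dist.values())
    in_range = all(0.0 <= p <= 1.0 for p in probs)
    sums_to_one = abs(sum(probs) - 1.0) <= tol
    return in_range and sums_to_one


# Hypothetical distribution, not taken from the lecture.
example = {0: 0.5, 1: 0.3, 2: 0.2}
print(is_valid_distribution(example))
```

The tolerance `tol` is there because floating-point sums are rarely exactly 1.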

Page 14

• We can develop the probability distribution using a Probability Tree.

[Figure: probability tree. At each stage the two branches carry P(S) = .2 and P(S^C) = .8.]

Page 15

Population/Probability Distribution …

• The discrete probability distribution describes

a population.

• Since we have populations, we can describe them by computing various parameters.

• Two of the population parameters we studied

previously are: population mean and

population variance.

Page 16

Random Variable

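• Our general definition of the population mean is

µ = (Σ xi) / N, where the sum runs over all N members of the population

• If we know that the random variable X is discrete, we can re‐express µ in terms of the probability distribution of X. We have:

µ = E(X) = Σ x·P(x), where the sum runs over all possible values x

• This parameter is also called the expected value of X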

Page 17

• The population variance of a discrete random variable can be expressed similarly. It is the weighted average of the squared deviations from the mean:

σ² = V(X) = Σ (x – µ)²·P(x)

• As before, there is a “short‐cut” formulation…

σ² = Σ x²·P(x) – µ²

• The standard deviation is the same as before:

σ = √σ²
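
A minimal Python sketch of these three parameters for a discrete random variable: the mean as a probability-weighted sum, the variance both as the weighted average of squared deviations and in the short-cut form E(X²) – µ², and the standard deviation as the square root of the variance. The distribution `dist` is made up for illustration and does not come from the lecture.

```python
from math import sqrt


def mean(dist):
    """E(X): sum of x * P(x) over all values x."""
    return sum(x * p for x, p in dist.items())


def variance(dist):
    """V(X) by definition: weighted average of squared deviations."""
    mu = mean(dist)
    return sum((x - mu) ** 2 * p for x, p in dist.items())


def variance_shortcut(dist):
    """V(X) by the short-cut form: E(X^2) - mu^2."""
    mu = mean(dist)
    return sum(x ** 2 * p for x, p in dist.items()) - mu ** 2


# Hypothetical distribution, not taken from the lecture.
dist = {0: 0.5, 1: 0.3, 2: 0.2}
mu = mean(dist)
var = variance(dist)
sd = sqrt(var)
print(mu, var, variance_shortcut(dist), sd)
```

Both variance functions should agree up to floating-point rounding.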

Page 18

µ = E(X) = 1·P(1) + 2·P(2) + … + 7·P(7)

Page 20

• There are certain properties of the Expected Value that are useful to know. Let ‘c’ be a constant. Then:

(1) E(c) = c

In words: The expected value of a constant (c) is just the value of the constant

(2) E(X + c) = E(X) + c

(3) E(cX) = cE(X)

In words: We can “pull” a constant out of the expected value expression (either as part of a sum with a random variable X or as a coefficient of a random variable X).
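
The three laws E(c) = c, E(X + c) = E(X) + c and E(cX) = cE(X) can be confirmed numerically on any small distribution. The sketch below is one way to do it in Python; the helper names, the distribution and the constant `c` are illustrative assumptions.

```python
def expected_value(dist):
    """E(X): sum of x * P(x) over all values x."""
    return sum(x * p for x, p in dist.items())


def transform(dist, f):
    """Distribution of f(X), merging the probabilities of equal outcomes."""
    out = {}
    for x, p in dist.items():
        y = f(x)
        out[y] = out.get(y, 0.0) + p
    return out


# Hypothetical distribution and constant, not taken from the lecture.
dist = {0: 0.5, 1: 0.3, 2: 0.2}
c = 4.0

e_x = expected_value(dist)
e_const = expected_value(transform(dist, lambda x: c))      # law (1)
e_shift = expected_value(transform(dist, lambda x: x + c))  # law (2)
e_scale = expected_value(transform(dist, lambda x: c * x))  # law (3)
print(e_const, e_shift, e_scale)
```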

Page 21

• Example 7.4: Monthly sales have a mean of

$25,000 and a standard deviation of $4,000.

• Profits are calculated by multiplying sales by 30% and subtracting fixed costs of $6,000

Find the mean monthly profit.

1) Describe the problem statement in algebraic terms:

sales have a mean of $25,000 ⇒ E(Sales) = 25,000

profits are calculated by… ⇒ Profit = .30(Sales) – 6,000

Page 22

Find the mean monthly profit.

E(Profit) =E[.30(Sales) – 6,000]

=E[.30(Sales)] – 6,000 [by rule #2]

=.30E(Sales) – 6,000 [by rule #3]

=.30(25,000) – 6,000 = 1,500

Thus, the mean monthly profit is $1,500.

Page 23

(3) V(cX) = c²V(X)

– In words: The variance of a constant coefficient times a random variable is the coefficient squared times the variance of the random variable.

Page 24

• Example 7.4 (continued): Monthly sales have a mean of $25,000 and a standard deviation of $4,000. Profits are calculated by multiplying sales by 30% and subtracting fixed costs of $6,000.

profits are calculated by… ⇒ Profit = .30(Sales) – 6,000

Page 25

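2) The variance of profit is V(Profit) = V[.30(Sales) – 6,000] = (.30)²V(Sales) = .09(4,000)² = 1,440,000, so the standard deviation of profit is √1,440,000 = $1,200.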

Page 26

• Example 7.4 (summary): Monthly sales have a mean of $25,000 and a standard deviation of $4,000. Profits are calculated by multiplying sales by 30% and subtracting fixed costs of $6,000.
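
As a quick check of Example 7.4, the sketch below applies the laws of expected value and variance to Profit = .30(Sales) – 6,000 in Python. It assumes the standard results that subtracting a constant leaves the variance unchanged and that V(cX) = c²V(X); the variable names are illustrative.

```python
from math import sqrt

# Figures stated in Example 7.4.
mean_sales = 25_000.0
sd_sales = 4_000.0
rate = 0.30            # profit is 30% of sales ...
fixed_costs = 6_000.0  # ... minus fixed costs

# E(cX - d) = c * E(X) - d
mean_profit = rate * mean_sales - fixed_costs

# V(cX - d) = c^2 * V(X); the constant d does not affect the spread.
var_profit = rate ** 2 * sd_sales ** 2
sd_profit = sqrt(var_profit)

print(mean_profit, var_profit, sd_profit)
```

The standard deviation scales by the coefficient (.30 × 4,000), while the fixed cost only shifts the mean.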

Page 30

• As before, we can calculate the marginal probabilities by

summing across rows and down columns to determine

the probabilities of X and Y individually:

Page 33

• The (population) coefficient of correlation is calculated in the same way as described

earlier…

Page 34

• Example 7.6: Compute the covariance and the coefficient of correlation…

Page 36

P(X+Y=2) = P(0,2) + P(1,1) + P(2,0)

Page 37

• This is:

Pr(2 ≤ X+Y ≤ 3) = 0.19 + 0.05 = 0.24

Page 38


Page 39

Two Random Variables

• Previously, we stated Laws for expected values and variances involving a random variable X and a

constant ‘c’.

• We also have laws involving the sum of two random variables:

(1) E(X + Y) = E(X) + E(Y)

(2) V(X + Y) = V(X) + V(Y) + 2COV(X, Y)

• If X and Y are independent, COV(X, Y) = 0 and thus (2) becomes:

V(X + Y) = V(X) + V(Y)

Page 40

marginal distributions of X and Y before. We have:

• We previously obtained E(X+Y) and V(X+Y) by deriving the distribution of X+Y. Alternatively, we can use the laws for sums of random variables:

E(X + Y) = E(X) + E(Y) = .7 + .5 = 1.2

V(X + Y) = V(X) + V(Y) + 2COV(X, Y) = .41 + .45 + 2(–.15) = .56
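
The sketch below compares the two routes in Python: applying the laws for sums, and deriving the distribution of X+Y directly. The joint probability table is not reproduced in this excerpt, so the `joint` dictionary is an assumption: it was chosen because it reproduces the figures quoted here (E(X) = .7, E(Y) = .5, V(X) = .41, V(Y) = .45, COV(X, Y) = –.15, P(X+Y=2) = .19).

```python
# Assumed joint distribution P(x, y); keys are (x, y) pairs.
# The table is an assumption consistent with the figures quoted in the text.
joint = {
    (0, 0): 0.12, (1, 0): 0.42, (2, 0): 0.06,
    (0, 1): 0.21, (1, 1): 0.06, (2, 1): 0.03,
    (0, 2): 0.07, (1, 2): 0.02, (2, 2): 0.01,
}


def expect(f):
    """Expected value of f(X, Y) under the joint distribution."""
    return sum(f(x, y) * p for (x, y), p in joint.items())


e_x = expect(lambda x, y: x)
e_y = expect(lambda x, y: y)
v_x = expect(lambda x, y: x * x) - e_x ** 2
v_y = expect(lambda x, y: y * y) - e_y ** 2
cov = expect(lambda x, y: x * y) - e_x * e_y

# Route 1: laws for sums of random variables.
e_sum_laws = e_x + e_y
v_sum_laws = v_x + v_y + 2 * cov

# Route 2: derive the distribution of X + Y, then use its mean and variance.
sum_dist = {}
for (x, y), p in joint.items():
    sum_dist[x + y] = sum_dist.get(x + y, 0.0) + p
e_sum_direct = sum(s * p for s, p in sum_dist.items())
v_sum_direct = sum(s * s * p for s, p in sum_dist.items()) - e_sum_direct ** 2

print(e_sum_laws, v_sum_laws, e_sum_direct, v_sum_direct)
```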

Page 41

Combinations of Two Random Variables

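• Let ‘c’ and ‘d’ be two constants. We can generalize the laws of expectation and variance from the sum X+Y to the linear combination cX + dY:

E(cX + dY) = cE(X) + dE(Y)

V(cX + dY) = c²V(X) + d²V(Y) + 2cd·COV(X, Y)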

Page 45

a) The expected values of the two stocks are E(R1) = .08 and E(R2) = .15. The weights are w1 = .25 and w2 = .75.

Thus,

E(Rp) = w1E(R1) + w2E(R2)

= .25(.08) + .75(.15)

= .1325 (an expected portfolio return of 13.25%)

Page 46

The standard deviations are σ1 = .12 and σ2 = .22. Thus,

V(Rp) = w1²σ1² + w2²σ2² + 2w1w2ρσ1σ2

= (.25²)(.12²) + (.75²)(.22²) + 2(.25)(.75)ρ(.12)(.22)

= .0281 + .0099ρ

When ρ = 1: V(Rp) = .0281 + .0099(1) = .0380

When ρ = .5: V(Rp) = .0281 + .0099(.5) = .0331

When ρ = 0: V(Rp) = .0281 + .0099(0) = .0281
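
The two-stock calculation can be reproduced with a short Python sketch. The function names are illustrative assumptions; the inputs are the weights, expected returns and standard deviations stated in the example.

```python
def portfolio_mean(w1, w2, mean1, mean2):
    """E(Rp) = w1*E(R1) + w2*E(R2)."""
    return w1 * mean1 + w2 * mean2


def portfolio_variance(w1, w2, sd1, sd2, rho):
    """V(Rp) = w1^2*sd1^2 + w2^2*sd2^2 + 2*w1*w2*rho*sd1*sd2."""
    return (w1 ** 2 * sd1 ** 2
            + w2 ** 2 * sd2 ** 2
            + 2 * w1 * w2 * rho * sd1 * sd2)


w1, w2 = 0.25, 0.75
expected_return = portfolio_mean(w1, w2, 0.08, 0.15)

# Variance under the three correlation scenarios in the example.
variances = {rho: portfolio_variance(w1, w2, 0.12, 0.22, rho)
             for rho in (1.0, 0.5, 0.0)}

print(expected_return)
for rho, v in variances.items():
    print(rho, v)
```

The variance falls as ρ falls, which is the diversification effect the three scenarios illustrate.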

Page 47

• Next, note that the statement of the problem

did not give us the covariance between R1

and R2 directly…

• However, it gave us the standard deviations of R1 and R2, and it asked us to solve the problem for given values of the coefficient of correlation ρ between R1 and R2.

Page 48

• Recall that:

ρ = COV(R1, R2) / (σ1σ2)

• and, therefore:

COV(R1, R2) = ρσ1σ2

Page 49

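• Therefore, in the three correlation scenarios to be considered, we have:

ρ = 1: COV(R1, R2) = (1)(.12)(.22) = .0264

ρ = .5: COV(R1, R2) = (.5)(.12)(.22) = .0132

ρ = 0: COV(R1, R2) = (0)(.12)(.22) = 0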

Page 52

• We can extend the formulas that describe the mean and variance of the returns of a portfolio of two stocks to a portfolio of k stocks:

E(Rp) = Σ_{i=1..k} wi E(Ri)

V(Rp) = Σ_{i=1..k} wi²σi² + 2 Σ_{i=1..k} Σ_{j=i+1..k} wi wj COV(Ri, Rj)

Page 53

• When k is greater than 2 the calculations can be tedious and time‐consuming.

• For example, when k = 3, we need to know the values of the three weights, three expected

values, three variances, and three covariances.

• When k = 4, there are four expected values, four variances and six covariances. [The number of

covariances required in general is k(k‐1)/2.]
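
Because the hand calculation grows quickly with k, a short program helps. The sketch below computes the variance of a k-stock portfolio from a covariance matrix (variances on the diagonal, covariances off it) and counts the distinct covariances needed. The function names are assumptions; the check reuses the two-stock figures from the earlier example with ρ = .5.

```python
def portfolio_variance_k(weights, cov):
    """Variance of a k-stock portfolio.

    weights: list of k weights.
    cov: k-by-k covariance matrix as nested lists; cov[i][i] is the
    variance of stock i and cov[i][j] the covariance of stocks i and j.
    """
    k = len(weights)
    own = sum(weights[i] ** 2 * cov[i][i] for i in range(k))
    cross = sum(weights[i] * weights[j] * cov[i][j]
                for i in range(k) for j in range(i + 1, k))
    return own + 2 * cross


def n_covariances(k):
    """Number of distinct covariances among k stocks: k(k-1)/2."""
    return k * (k - 1) // 2


# Two-stock check: sd1 = .12, sd2 = .22, rho = .5, weights .25 and .75.
sd1, sd2, rho = 0.12, 0.22, 0.5
cov12 = rho * sd1 * sd2
cov_matrix = [[sd1 ** 2, cov12],
              [cov12, sd2 ** 2]]
var_two = portfolio_variance_k([0.25, 0.75], cov_matrix)

print(var_two, n_covariances(3), n_covariances(4))
```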

Page 55

• “Success” and “Failure” are just labels for the two outcomes of a binomial experiment; no value judgment is implied.

Page 56

4) The trials are independent  (i.e. the outcome of heads on the first flip will have no impact on subsequent coin flips).

all conditions were met.

Page 57

• The binomial random variable counts the number of successes in n trials of the binomial experiment. It can take on values from 0, 1, 2, …, n. Thus, it is a discrete random variable.

• To calculate the probability associated with each value of X, we use combinatorics:

P(x) = [n! / (x!(n – x)!)] p^x (1 – p)^(n – x), for x = 0, 1, 2, …, n
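
A minimal Python sketch of the binomial probability for x successes in n trials, using the standard library's `math.comb` for the number of combinations. The function name `binomial_pmf` and the sample values of n and p are illustrative assumptions.

```python
from math import comb


def binomial_pmf(x, n, p):
    """P(X = x) for a binomial random variable with n trials and
    success probability p: C(n, x) * p^x * (1 - p)^(n - x)."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)


# Hypothetical experiment: 5 flips of a fair coin.
n, p = 5, 0.5
probs = [binomial_pmf(x, n, p) for x in range(n + 1)]
print(probs)
```

The probabilities over x = 0, 1, …, n should add to 1, which is a useful sanity check.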

Page 59

• Thus, we have that the probability of any outcome that yields ‘x’ successes in ‘n’ trials is:

p^x (1 – p)^(n – x)

• In addition, there are a total of n! / (x!(n – x)!) such outcomes.

• Thus, adding up the probabilities of all such outcomes, we obtain the binomial probability formula:

P(x) = [n! / (x!(n – x)!)] p^x (1 – p)^(n – x)

Page 60

• Example: A quiz consists of 10 multiple‐choice questions. Each question has five possible

answers, only one of which is correct.

• Suppose a student plans to guess the answer to

each question.

• What is the probability that the student gets no answers correct?

• What is the probability that the student gets two answers correct?

Page 62

• Thus, we have a binomial experiment where

n=10 , and P(success) = .20

• What is the probability that the student gets no answers correct? This is P(X=0):

P(0) = [10! / (0!10!)] (.20)^0 (.80)^10 = .1074

The student has about an 11% chance of getting no answers correct using the guessing strategy.

Page 63

• What is the probability that the student gets two answers correct? That is, P(X=2):

P(2) = [10! / (2!8!)] (.20)^2 (.80)^8 = 45(.04)(.1678) = .3020

The student has about a 30% chance of getting exactly two answers correct using the guessing strategy.

Page 64

• Thus far, we have been using the binomial probability distribution to find probabilities for individual values of X.

Page 65

• We already know P(0) = .1074 and P(2) = .3020. Using the binomial formula to calculate the others:

P(1) = .2684 , P(3) = .2013, and P(4) = .0881

• We have P(X ≤ 4) = .1074 + .2684 + … + .0881 = .9672

• Thus, it is about 97% probable that the student will fail the test by relying on luck and guessing at the answers…
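
The quiz figures can be checked with a few lines of Python. This sketch assumes the setup stated in the example (n = 10 questions, success probability .20 per guess); the helper name `binomial_pmf` is illustrative.

```python
from math import comb


def binomial_pmf(x, n, p):
    """P(X = x) for n independent trials with success probability p."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)


n, p = 10, 0.20  # 10 questions, five choices each, pure guessing

p_none = binomial_pmf(0, n, p)   # no answers correct
p_two = binomial_pmf(2, n, p)    # exactly two answers correct
# Cumulative probability of four or fewer correct answers.
p_at_most_4 = sum(binomial_pmf(x, n, p) for x in range(5))

print(round(p_none, 4), round(p_two, 4), round(p_at_most_4, 4))
```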

Page 66

• Calculating binomial probabilities by hand is tedious and error-prone. There is an easier way. Refer to Table 1 in the Appendix.

Page 70

• We can compute these probabilities from cumulative probabilities; we explain how next…

Page 71

• If X is discrete, we can obtain P(X=k) from P(X ≤ k)

and P(X ≤ k‐1) by:

P(X = k) = P(X ≤ k) – P(X ≤ k–1)

• Likewise, for probabilities given as P(X ≥ k), we have:

P(X ≥ k) = 1 – P(X ≤ k–1)

• Finally, we can compute Pr(k1 ≤ X ≤ k2) as:

Pr(k1 ≤ X ≤ k2) = P(X ≤ k2) – P(X ≤ k1–1)
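
These identities are what make a cumulative table sufficient. The Python sketch below builds a cumulative table for a binomial variable and recovers point, upper-tail and interval probabilities from it. All function names are assumptions, and the table is computed rather than read from the Appendix; the quiz setup (n = 10, p = .20) is reused as the example.

```python
from math import comb


def binomial_cdf_table(n, p):
    """Cumulative probabilities [P(X <= 0), ..., P(X <= n)]."""
    table, running = [], 0.0
    for x in range(n + 1):
        running += comb(n, x) * p ** x * (1 - p) ** (n - x)
        table.append(running)
    return table


def _below(cdf, k):
    """P(X <= k - 1), which is 0 when k is 0."""
    return cdf[k - 1] if k > 0 else 0.0


def prob_equal(cdf, k):
    """P(X = k) = P(X <= k) - P(X <= k - 1)."""
    return cdf[k] - _below(cdf, k)


def prob_at_least(cdf, k):
    """P(X >= k) = 1 - P(X <= k - 1)."""
    return 1.0 - _below(cdf, k)


def prob_between(cdf, k1, k2):
    """P(k1 <= X <= k2) = P(X <= k2) - P(X <= k1 - 1)."""
    return cdf[k2] - _below(cdf, k1)


cdf = binomial_cdf_table(10, 0.20)
print(prob_equal(cdf, 2), prob_at_least(cdf, 5), prob_between(cdf, 1, 4))
```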

Page 72

• Example: Problem 7.93. The leading brand of dishwasher detergent has a 30% market share. A sample of 25 dishwasher detergent customers was taken. What is the probability that 10 or fewer customers chose the leading brand?

• This is an example of a binomial random variable:

X=# of customers who bought leading dishwasher brand

• The underlying experiment consists of:

n=25 trials p=Prob(“Success”)=0.30

• The problem asks for P(X ≤ 10) . Using Table 1 in the

Appendix, we have P(X ≤ 10)=0.9022
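
When the table is not at hand, the same cumulative probability can be computed directly. A minimal Python sketch under the setup stated in the problem (n = 25, p = .30); the function name `binomial_cdf` is an illustrative assumption.

```python
from math import comb


def binomial_cdf(k, n, p):
    """P(X <= k) for a binomial random variable, by summing P(0)..P(k)."""
    return sum(comb(n, x) * p ** x * (1 - p) ** (n - x)
               for x in range(k + 1))


# 25 customers, 30% market share, 10 or fewer choosing the leading brand.
p_at_most_10 = binomial_cdf(10, 25, 0.30)
print(round(p_at_most_10, 4))
```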

Page 73

• Example: Problem 7.97. It is believed that 10% of all voters in the United States consider themselves “Independent”. A survey asked 25 people to identify themselves as Democrat, Republican or Independent.

Page 74

• Once again, this is an example of a binomial random variable

X = # of Independent voters in the survey

• The underlying experiment consists of

n=25 trials p=Prob(“success”)=0.10

• The problem asks:

a) Pr(X = 0)

b) Pr(X ≤ 4)

c) Pr(X ≥ 3)

Page 78

• Chebysheff’s Theorem states that the probability that a random variable X lies within ‘k’ standard deviations of its mean is at least:

1 – 1/k²

• Since we want this to be at least 75%, we first need to find the ‘k’ such that

1 – 1/k² = .75

• This yields k = 2.

Page 79

• Next, recall that in the example we have n=25 and p=0.30

• Therefore, using the expectation and variance formulas for Binomial random variables, we have:

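E[X] = n∙p = 7.5 and V[X] = n∙p∙(1 – p) = 25(.30)(.70) = 5.25, so σ = √5.25 ≈ 2.29

• Therefore, an interval that will include, with at least 75% probability, the actual number of customers who will choose the leading brand is E[X] ± 2σ = 7.5 ± 2(2.29), that is, from about 2.9 to about 12.1.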
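
The Chebysheff calculation can be sketched in Python as follows. It assumes the usual statement of the theorem (probability at least 1 – 1/k² of lying within k standard deviations of the mean) and the binomial mean and variance formulas n∙p and n∙p∙(1 – p); the function names are illustrative.

```python
from math import sqrt


def chebyshev_k(prob):
    """Smallest k with 1 - 1/k^2 >= prob, i.e. k = 1 / sqrt(1 - prob)."""
    return 1.0 / sqrt(1.0 - prob)


def chebyshev_interval(mean, sd, prob):
    """Interval mean +/- k*sd that contains X with probability >= prob."""
    k = chebyshev_k(prob)
    return mean - k * sd, mean + k * sd


# Detergent example: n = 25 customers, p = .30 market share.
n, p = 25, 0.30
mean = n * p
sd = sqrt(n * p * (1 - p))
low, high = chebyshev_interval(mean, sd, 0.75)
print(chebyshev_k(0.75), low, high)
```

The bound holds for any distribution, which is why the resulting interval is wide compared with what the exact binomial probabilities would give.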

Page 80

• Named for Simeon Poisson, the Poisson distribution

is a discrete probability distribution and refers to the

number of events (a.k.a. successes) within a specific time period or region of space.

stretch of highway. (The interval is defined by both time, 1 day, and space, the particular stretch of

highway.)

Page 81

• Difference between Binomial and Poisson

Random variables:

• A binomial random variable is the number of successes in a given number of trials, whereas

a Poisson random variable is the number of

successes in an interval of time or in a specific region of space.

Page 83

time period

Page 85

• Example 7.12: A statistics instructor has observed that the number of typographical errors in new editions of textbooks varies considerably from book to book.

Page 87

• How to proceed?

• First, note that we are now talking about an interval of 400 pages.

Page 89

• For a 400 page book, what is the probability that there are five or fewer typos?

P(X≤5) = P(0) + P(1) + … + P(5)

• This is rather tedious to solve manually. A better alternative is to refer to Table 2 in Appendix B…

k=5, µ =6, and P(X ≤ k) = .446

“there is about a 45% chance there are 5 or fewer typos”
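
The table lookup can be reproduced by summing Poisson probabilities directly. The Python sketch below assumes the standard Poisson formula P(x) = e^(–µ) µ^x / x! and uses µ = 6 and k = 5 as stated above; the function names are illustrative.

```python
from math import exp, factorial


def poisson_pmf(x, mu):
    """P(X = x) for a Poisson random variable with mean mu."""
    return exp(-mu) * mu ** x / factorial(x)


def poisson_cdf(k, mu):
    """P(X <= k): sum of P(0), P(1), ..., P(k)."""
    return sum(poisson_pmf(x, mu) for x in range(k + 1))


# 400 pages with a mean of 6 typos; probability of five or fewer.
p_at_most_5 = poisson_cdf(5, 6.0)
print(round(p_at_most_5, 3))
```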

Page 90

• Characterize a range of values that will include the actual number of typos found in a 400 page book with probability at least 90%.

• Again, we can use Chebysheff’s Theorem. Since we want this to be at least 90%, we first need to find the ‘k’ such that

1 – 1/k² = .90

• This yields k = √10 ≈ 3.16.

• Next, recall that, if X is a Poisson random variable, then

E(X) = V(X) = µ

• If X is the number of typos in 400 pages, then

µ = 6, so E(X) = 6 and σ = √6 ≈ 2.45

Page 93

5‐day period.

Page 94

• This is Pr(X ≥ 3). Again, to use Table 2, we need to express this in terms of a probability of the type Pr(X ≤ k). Note that:

Pr(X ≥ 3) = 1 – Pr(X ≤ 2)
