Since $A \cup \overline{A} = S$ and $A$ and $\overline{A}$ are mutually exclusive, it follows from Axioms 2 and 3 that $P(A) + P(\overline{A}) = P(S) = 1$, or
$$P(\overline{A}) = 1 - P(A) \tag{6.3}$$
A generalization of Axiom 3 to events that are not mutually exclusive is obtained by noting that $A \cup B = A \cup (B \cap \overline{A})$, where $A$ and $B \cap \overline{A}$ are disjoint (this is most easily seen by using a Venn diagram). Therefore, Axiom 3 can be applied to give
$$P(A \cup B) = P(A) + P(B \cap \overline{A}) \tag{6.4}$$
Similarly, we note from a Venn diagram that the events $A \cap B$ and $B \cap \overline{A}$ are disjoint and that $(A \cap B) \cup (B \cap \overline{A}) = B$, so that
$$P(A \cap B) + P(B \cap \overline{A}) = P(B) \tag{6.5}$$
Solving for $P(B \cap \overline{A})$ from (6.5) and substituting into (6.4) yields the following for $P(A \cup B)$:
$$P(A \cup B) = P(A) + P(B) - P(A \cap B) \tag{6.6}$$
This is the desired generalization of Axiom 3.
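The complement rule (6.3) and the union rule (6.6) can be checked numerically on a small sample space. The following is a minimal sketch, assuming a single roll of a fair die with equally likely outcomes; the die, the events `A` and `B`, and the helper `prob` are illustrative choices, not part of the text.

```python
from fractions import Fraction

# Assumed sample space (not from the text): one roll of a fair six-sided die.
S = frozenset(range(1, 7))

def prob(event):
    """Probability of an event (a subset of S) under equal likelihood."""
    return Fraction(len(event & S), len(S))

A = frozenset({2, 4, 6})   # hypothetical event: the roll is even
B = frozenset({4, 5, 6})   # hypothetical event: the roll exceeds 3
A_bar = S - A              # complement of A

# (6.3): P(A_bar) = 1 - P(A)
complement_ok = prob(A_bar) == 1 - prob(A)
# (6.4): P(A u B) = P(A) + P(B n A_bar), since A and B n A_bar are disjoint
step_ok = prob(A | B) == prob(A) + prob(B & A_bar)
# (6.6): P(A u B) = P(A) + P(B) - P(A n B)
union_ok = prob(A | B) == prob(A) + prob(B) - prob(A & B)

print(complement_ok, step_ok, union_ok, prob(A | B))
```

Exact fractions are used rather than floating point so that the equalities can be tested without rounding error.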
2 Axiom 3 can be generalized to $P(A \cup B \cup C) = P(A) + P(B) + P(C)$ for $A$, $B$, and $C$ mutually exclusive by considering $B_1 = B \cup C$ to be a composite event in Axiom 3 and applying Axiom 3 twice: i.e., $P(A \cup B_1) = P(A) + P(B_1) = P(A) + P(B) + P(C)$. Clearly, in this way we can generalize this result to any finite number of mutually exclusive events.
Now consider two events $A$ and $B$, with individual probabilities $P(A) > 0$ and $P(B) > 0$, respectively, and joint event probability $P(A \cap B)$. We define the conditional probability of event $A$ given that event $B$ occurred as
$$P(A|B) = \frac{P(A \cap B)}{P(B)} \tag{6.7}$$
Similarly, the conditional probability of event $B$ given that event $A$ has occurred is defined as
$$P(B|A) = \frac{P(A \cap B)}{P(A)} \tag{6.8}$$
Putting Equations (6.7) and (6.8) together, we obtain
$$P(A|B)\,P(B) = P(B|A)\,P(A) \tag{6.9}$$
or
$$P(B|A) = \frac{P(B)\,P(A|B)}{P(A)} \tag{6.10}$$
This is a special case of Bayes' rule.
Finally, suppose that the occurrence or nonoccurrence of $B$ in no way influences the occurrence or nonoccurrence of $A$. If this is true, $A$ and $B$ are said to be statistically independent. Thus, if we are given $B$, this tells us nothing about $A$, and therefore $P(A|B) = P(A)$. Similarly, $P(B|A) = P(B)$. From Equation (6.7) or (6.8) it follows that, for such events,
$$P(A \cap B) = P(A)\,P(B) \tag{6.11}$$
Equation (6.11) will be taken as the definition of statistically independent events.
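As an illustration of definitions (6.7) through (6.11), the sketch below computes conditional probabilities and tests independence by direct counting. It assumes a fair die as the sample space; the events `A`, `B`, `C` and the helpers `cond_prob` and `independent` are hypothetical names introduced only for this example.

```python
from fractions import Fraction

# Assumed sample space (not from the text): one roll of a fair six-sided die.
S = frozenset(range(1, 7))

def prob(event):
    """Probability of an event (a subset of S) under equal likelihood."""
    return Fraction(len(event), len(S))

def cond_prob(a, b):
    """P(a | b) as defined in (6.7); requires P(b) > 0."""
    if not b:
        raise ValueError("conditioning event must have nonzero probability")
    return prob(a & b) / prob(b)

def independent(a, b):
    """True if P(a n b) = P(a) P(b), the definition in (6.11)."""
    return prob(a & b) == prob(a) * prob(b)

A = frozenset({2, 4, 6})      # hypothetical event: even roll
B = frozenset({1, 2, 3, 4})   # hypothetical event: roll of at most 4
C = frozenset({4, 5, 6})      # hypothetical event: roll of more than 3

# (6.10): P(B|A) = P(B) P(A|B) / P(A)
bayes_ok = cond_prob(B, A) == prob(B) * cond_prob(A, B) / prob(A)

print(independent(A, B), cond_prob(A, B) == prob(A))  # A and B are independent
print(independent(A, C), cond_prob(A, C) == prob(A))  # A and C are not
print(bayes_ok)
```

For the independent pair, conditioning on `B` leaves the probability of `A` unchanged, which is the intuitive statement that (6.11) formalizes.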
EXAMPLE 6.3
Referring to Example 6.2, suppose $A$ denotes at least one head and $B$ denotes a match. The sample space is shown in Figure 6.1(b). To find $P(A)$ and $P(B)$, we may proceed in several different ways.
Solution
First, if we use equal likelihood, there are three outcomes favorable to $A$ (that is, HH, HT, and TH) among four possible outcomes, yielding $P(A) = \frac{3}{4}$. For $B$, there are two favorable outcomes in four possibilities, giving $P(B) = \frac{1}{2}$.
As a second approach, we note that, if the coins do not influence each other when tossed, the outcomes on separate coins are statistically independent with $P(H) = P(T) = \frac{1}{2}$. Also, event $A$ consists of any of the mutually exclusive outcomes HH, TH, and HT, giving
$$P(A) = \left(\frac{1}{2}\cdot\frac{1}{2}\right) + \left(\frac{1}{2}\cdot\frac{1}{2}\right) + \left(\frac{1}{2}\cdot\frac{1}{2}\right) = \frac{3}{4} \tag{6.12}$$
by (6.11) and Axiom 3, generalized. Similarly, since $B$ consists of the mutually exclusive outcomes HH and TT,
$$P(B) = \left(\frac{1}{2}\cdot\frac{1}{2}\right) + \left(\frac{1}{2}\cdot\frac{1}{2}\right) = \frac{1}{2} \tag{6.13}$$
again through the use of (6.11) and Axiom 3. Also, $P(A \cap B) = P(\text{at least one head and a match}) = P(\text{HH}) = \frac{1}{4}$.
Next, consider the probability of at least one head given a match, $P(A|B)$. Using the definition of conditional probability (6.7), we obtain
$$P(A|B) = \frac{P(A \cap B)}{P(B)} = \frac{1/4}{1/2} = \frac{1}{2} \tag{6.14}$$
which is reasonable, since given $B$, the only outcomes under consideration are HH and TT, only one of which is favorable to event $A$. Next, finding $P(B|A)$, the probability of a match given at least one head, we obtain
$$P(B|A) = \frac{P(A \cap B)}{P(A)} = \frac{1/4}{3/4} = \frac{1}{3} \tag{6.15}$$
Checking this result using the principle of equal likelihood, we have one favorable event among three candidate events (HH, TH, and HT), which yields a probability of $\frac{1}{3}$. We note that
$$P(A \cap B) \neq P(A)\,P(B) \tag{6.16}$$
Thus, events $A$ and $B$ are not statistically independent, although the outcomes on the two separate coins are independent.
Finally, consider the probability of the union, $P(A \cup B)$. Using (6.6), we obtain
$$P(A \cup B) = \frac{3}{4} + \frac{1}{2} - \frac{1}{4} = 1 \tag{6.17}$$
Remembering that $P(A \cup B)$ is the probability of at least one head, or a match, or both, we see that this includes all possible outcomes, thus confirming the result.
■
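The results of Example 6.3 can be reproduced by enumerating the four equally likely outcomes of the two-coin toss. This is a minimal sketch under the equal-likelihood assumption used in the example; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

# The four equally likely outcomes of tossing two coins: HH, HT, TH, TT.
outcomes = set(product("HT", repeat=2))

def prob(event):
    """Probability of an event (a set of outcomes) under equal likelihood."""
    return Fraction(len(event), len(outcomes))

A = {o for o in outcomes if "H" in o}      # at least one head
B = {o for o in outcomes if o[0] == o[1]}  # a match

p_a, p_b, p_ab = prob(A), prob(B), prob(A & B)
p_a_given_b = p_ab / p_b        # (6.14)
p_b_given_a = p_ab / p_a        # (6.15)
p_union = p_a + p_b - p_ab      # (6.17), using (6.6)

print(p_a, p_b, p_ab)                     # 3/4 1/2 1/4
print(p_a_given_b, p_b_given_a, p_union)  # 1/2 1/3 1
print(p_ab == p_a * p_b)                  # False: A and B are not independent
```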
EXAMPLE 6.4
This example illustrates the reasoning to be applied when trying to determine if two events are independent. A single card is drawn at random from a deck of cards. Which of the following pairs of events are independent? (a) The card is a club, and the card is black. (b) The card is a king, and the card is black.
Solution
We use the relationship $P(A \cap B) = P(A|B)\,P(B)$ (always valid) and check it against the relation $P(A \cap B) = P(A)\,P(B)$ (valid only for independent events). For part (a), we let $A$ be the event that the card is a club and $B$ be the event that it is black. Since there are 26 black cards in an ordinary deck of cards, 13 of which are clubs, the conditional probability $P(A|B)$ is $\frac{13}{26}$ (given we are considering only black cards, we have 13 favorable outcomes for the card being a club). The probability that the card is black is $P(B) = \frac{26}{52}$, because half the cards in the 52-card deck are black. The probability of a club (event $A$), on the other hand, is $P(A) = \frac{13}{52}$ (13 cards in a 52-card deck are clubs). In this case,
$$P(A|B)\,P(B) = \frac{13}{26}\cdot\frac{26}{52} \neq P(A)\,P(B) = \frac{13}{52}\cdot\frac{26}{52} \tag{6.18}$$
so the events are not independent.
For part (b), we let $A$ be the event that a king is drawn and $B$ be the event that the card is black. In this case, the probability of a king given that the card is black is $P(A|B) = \frac{2}{26}$ (two of the 26 black cards are kings). The probability of a king is simply $P(A) = \frac{4}{52}$ (four kings in the 52-card deck), and $P(B) = P(\text{black}) = \frac{26}{52}$. Hence,
$$P(A|B)\,P(B) = \frac{2}{26}\cdot\frac{26}{52} = P(A)\,P(B) = \frac{4}{52}\cdot\frac{26}{52} \tag{6.19}$$
which shows that the events king and black are statistically independent.
■
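The two independence checks of Example 6.4 can also be carried out by counting cards directly. The sketch below assumes a standard 52-card deck with clubs and spades as the black suits; the rank and suit labels and the helper `independent` are illustrative.

```python
from fractions import Fraction
from itertools import product

# Assumed standard deck: 13 ranks in each of 4 suits.
ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["clubs", "spades", "hearts", "diamonds"]
deck = set(product(ranks, suits))

def prob(event):
    """Probability of drawing a card in `event` from a well-shuffled deck."""
    return Fraction(len(event), len(deck))

def independent(a, b):
    """True if P(a n b) = P(a) P(b), as in (6.11)."""
    return prob(a & b) == prob(a) * prob(b)

club = {card for card in deck if card[1] == "clubs"}
black = {card for card in deck if card[1] in ("clubs", "spades")}
king = {card for card in deck if card[0] == "K"}

print(independent(club, black))  # part (a): False
print(independent(king, black))  # part (b): True
```

Knowing that a card is black doubles the chance that it is a club, but leaves the chance that it is a king unchanged at 1/13, which is why only the second pair is independent.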
EXAMPLE 6.5
As an example more closely related to communications, consider the transmission of binary digits through a channel as might occur, for example, in computer networks. As is customary, we denote the two possible symbols as 0 and 1. Let the probability of receiving a 0, given that a 0 was sent, $P(0r|0s)$, and the probability of receiving a 1, given that a 1 was sent, $P(1r|1s)$, be
$$P(0r|0s) = P(1r|1s) = 0.9 \tag{6.20}$$
Thus, the probabilities $P(1r|0s)$ and $P(0r|1s)$ must be
$$P(1r|0s) = 1 - P(0r|0s) = 0.1 \tag{6.21}$$
and
$$P(0r|1s) = 1 - P(1r|1s) = 0.1 \tag{6.22}$$
respectively. These probabilities characterize the channel and would be obtained through experimental measurement or analysis. Techniques for calculating them for particular situations will be discussed in Chapters 9 and 10.
In addition to these probabilities, suppose that we have determined through measurement that the probability of sending a 0 is
$$P(0s) = 0.8 \tag{6.23}$$
and therefore the probability of sending a 1 is
$$P(1s) = 1 - P(0s) = 0.2 \tag{6.24}$$
Note that once $P(0r|0s)$, $P(1r|1s)$, and $P(0s)$ are specified, the remaining probabilities are calculated using Axioms 2 and 3.
The next question we ask is, "If a 1 was received, what is the probability, $P(1s|1r)$, that a 1 was sent?" Applying Bayes' rule, we find that
$$P(1s|1r) = \frac{P(1r|1s)\,P(1s)}{P(1r)} \tag{6.25}$$
To find $P(1r)$, we note that
$$P(1r, 1s) = P(1r|1s)\,P(1s) = 0.18 \tag{6.26}$$
and
$$P(1r, 0s) = P(1r|0s)\,P(0s) = 0.08 \tag{6.27}$$
Thus,
$$P(1r) = P(1r, 1s) + P(1r, 0s) = 0.18 + 0.08 = 0.26 \tag{6.28}$$
and
$$P(1s|1r) = \frac{(0.9)(0.2)}{0.26} \approx 0.69 \tag{6.29}$$
Similarly, one can calculate $P(0s|1r) = 0.31$, $P(0s|0r) = 0.97$, and $P(1s|0r) = 0.03$. For practice, you should go through the necessary calculations.
■
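The calculations of Example 6.5 can be organized as a short program. The following sketch uses the channel probabilities (6.20) through (6.22) and the source probabilities (6.23) and (6.24); the dictionary layout and the function name `posterior` are assumptions made for illustration.

```python
from fractions import Fraction

# Source (a priori) probabilities P(0s) and P(1s), from (6.23) and (6.24).
p_sent = {0: Fraction(8, 10), 1: Fraction(2, 10)}

# Channel transition probabilities P(r | s), keyed as (received, sent),
# from (6.20) through (6.22).
p_rx_given_tx = {
    (0, 0): Fraction(9, 10), (1, 0): Fraction(1, 10),
    (0, 1): Fraction(1, 10), (1, 1): Fraction(9, 10),
}

def posterior(sent, received):
    """P(sent | received) by Bayes' rule, as in (6.25)."""
    # Joint probabilities P(received, t) = P(received | t) P(t), as in (6.26)-(6.27).
    joint = {t: p_rx_given_tx[(received, t)] * p_sent[t] for t in (0, 1)}
    # P(received) is the sum of the joints over both sent symbols, as in (6.28).
    return joint[sent] / sum(joint.values())

for received in (1, 0):
    for sent in (1, 0):
        value = float(posterior(sent, received))
        print(f"P({sent}s|{received}r) = {value:.2f}")
```

Under these assumptions the four printed values are 0.69, 0.31, 0.03, and 0.97, matching (6.29) and the values quoted at the end of the example.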