7.1 Introduction and definition
Conditional probability is an important concept that we can use to change a measurement of uncertainty as our information changes.
Example 17. For a randomly selected individual, suppose the probabilities of the four blood types are P (type O) = 0.45, P (type A) = 0.4, P (type B) = 0.1 and P (type AB) = 0.05. A test is taken to determine the blood type, but the test is only able to declare that the blood type is either A or B. What is the probability that the blood type is A?
Definition 18. We define P (E|F ) to be the conditional probability of E given F , where
P(E|F) := P(E ∩ F) / P(F),    (3)
assuming P (F ) > 0. We can interpret this to mean
“If it is known that F has occurred, what is the probability that E has also occurred?”
If we know that the outcome belongs to the set F, then for E to occur also, the outcome must lie in the intersection E ∩ F. To get the conditional probability P(E|F), we ‘measure’ (using the probability measure P) the fraction of the set F that is also in the set E.
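As a minimal sketch of Definition 18 in code, the following Python snippet applies equation (3) to the blood-type probabilities of Example 17. The helper name `conditional` is an illustrative choice rather than anything defined in these notes, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction

def conditional(p_joint, p_given):
    """Return P(E|F) = P(E ∩ F) / P(F); the definition requires P(F) > 0."""
    if p_given <= 0:
        raise ValueError("P(F) must be positive")
    return p_joint / p_given

# Blood-type probabilities from Example 17.
p = {"O": Fraction(45, 100), "A": Fraction(40, 100),
     "B": Fraction(10, 100), "AB": Fraction(5, 100)}

# F = "type is A or B" (what the test reports); E = "type is A".
# E is contained in F, so E ∩ F = E.
p_F = p["A"] + p["B"]
p_E_and_F = p["A"]

p_A_given_A_or_B = conditional(p_E_and_F, p_F)
print(p_A_given_A_or_B)  # 4/5
```

Here the test result rules out types O and AB, so the probability of type A rises from 0.4 to 0.8.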
A comment on joint probabilities
Definition 18 gives us an intuitive way to think about joint probabilities P(E ∩ F). Rearranging (3) we have
P (E ∩ F ) = P (F )P (E|F ), and we can swap E and F and write
P (F ∩ E) = P (E ∩ F ) = P (E)P (F |E).
This means we can calculate the probability that both E and F occur by considering either
1. the probability that E occurs, and then the probability that F occurs given that E has occurred, or:
2. the probability that F occurs, and then the probability that E occurs given that F has occurred.
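The two orderings can be checked numerically. The sketch below uses a hypothetical example that is not taken from these notes: a fair six-sided die, with E the event "the roll is even" and F the event "the roll is greater than 3".

```python
from fractions import Fraction

outcomes = {1, 2, 3, 4, 5, 6}   # fair die: equally likely outcomes
E = {2, 4, 6}                   # roll is even
F = {4, 5, 6}                   # roll is greater than 3

def prob(event):
    """Classical probability of an event (a subset of outcomes)."""
    return Fraction(len(event), len(outcomes))

def cond(event, given):
    """P(event | given): the fraction of `given` that also lies in `event`."""
    return Fraction(len(event & given), len(given))

joint = prob(E & F)             # P(E ∩ F), counted directly
via_F = prob(F) * cond(E, F)    # P(F) P(E|F)
via_E = prob(E) * cond(F, E)    # P(E) P(F|E)
print(joint, via_F, via_E)      # 1/3 1/3 1/3
```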
Example 18. Visualising the blood group example.
Specifying conditional probabilities directly
Equation (3) tells us how to calculate P (E|F ) if we already know P (E ∩ F ) and P (F ). But in some situations, we may be able to specify P (E|F ) directly, given the information at hand.
Example 19. A diagnostic test has been developed for a particular disease. In a group of patients known to be carrying the disease, the test successfully detected the disease for 95% of the patients. An individual is selected at random from the population (and so may or may not be carrying the disease).
Let D be the event that the individual has the disease, and T be the event that the test declares the individual has the disease. Using the frequency approach to specifying a probability and the information above, which of the following probabilities can we specify?
• P (D)
• P (D ∩ T )
• P (T |D)
• P (D|T )
In some cases, the conditional probability will be ‘obvious’, and you should have the confidence just to write it down!
Example 20. 1. A playing card is drawn at random from a standard deck of 52. Let A be the event that the card is a heart, and B be the event that the card is red. What are P (A|B) and P (B|A)?
2. On a National Lottery ticket (6 numbers chosen out of 59), let A be the event that the 6th number on the ticket matches the 6th number drawn, and B be the event that the first 5 numbers on the ticket match the first 5 numbers drawn. What is P(A|B)?
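For part 1 of Example 20, the ‘obvious’ answers can be confirmed by brute-force counting. The following sketch enumerates a standard deck and computes each conditional probability as a fraction of the conditioning set; the representation of cards as (rank, suit) pairs is an illustrative assumption.

```python
from fractions import Fraction

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["hearts", "diamonds", "clubs", "spades"]
deck = [(rank, suit) for rank in ranks for suit in suits]   # 52 cards

A = {card for card in deck if card[1] == "hearts"}                 # card is a heart
B = {card for card in deck if card[1] in ("hearts", "diamonds")}   # card is red

def cond(event, given):
    """P(event | given) for equally likely cards."""
    return Fraction(len(event & given), len(given))

p_A_given_B = cond(A, B)   # 13 of the 26 red cards are hearts
p_B_given_A = cond(B, A)   # every heart is red
print(p_A_given_B, p_B_given_A)   # 1/2 1
```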
7.2 Independence
Definition 19. Two events E and F are said to be independent if P (E ∩ F ) = P (E)P (F ).
We can use the definition of conditional probability to give a more intuitive definition of independence. The events E and F are independent if
P (E|F ) = P (E). (4)
(If (4) holds, then P(E ∩ F) = P(E|F)P(F) = P(E)P(F).) We can read this to mean “If E and F are independent, then learning that F has occurred does not change the probability that E will occur (and vice versa).”
Example 21. Suppose two pregnant women are chosen at random, and consider whether each gives birth to a boy or girl (assume there will be no twins, triplets etc.). We can write the sample space as
S = {(boy, boy), (boy, girl), (girl, boy), (girl, girl)},
where, for example, element (boy, girl), is the outcome that the first woman gives birth to a boy, and the second woman gives birth to a girl. Define Bi to be the event that the ith woman gives birth to a boy.
1. With regard to S, what are the subsets B1, B2 and B1 ∩ B2?
2. Suppose we assume that each outcome in the sample space is equally likely. What are the values of P (B1), P (B2) and P (B1 ∩ B2)?
3. Are B1 and B2 independent?
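One way to check the answers to Example 21 is to enumerate the sample space and test Definition 19 directly. This is a sketch under the example's own assumption that the four outcomes are equally likely; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

# Sample space S: one (first woman, second woman) pair per outcome.
S = list(product(["boy", "girl"], repeat=2))

def prob(event):
    """Classical probability: favourable outcomes over total outcomes."""
    return Fraction(len(event), len(S))

B1 = {s for s in S if s[0] == "boy"}   # first woman has a boy
B2 = {s for s in S if s[1] == "boy"}   # second woman has a boy
both = B1 & B2                         # B1 ∩ B2 = {(boy, boy)}

# Definition 19: independent if P(B1 ∩ B2) = P(B1) P(B2).
independent = prob(both) == prob(B1) * prob(B2)
print(prob(B1), prob(B2), prob(both), independent)   # 1/2 1/2 1/4 True
```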
Example 22. Two playing cards are drawn at random from a standard 52 card deck. We observe the number of aces and the number of kings in the two cards drawn. We can write the sample space as
S = {(0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (2, 0)},
where, for example, (1, 0) corresponds to observing 1 ace and no kings. Let A be the event of at least one ace, and let K be the event of at least one king.
1. With regard to S, what are the subsets A, K and A ∩ K?
2. Assume that each card in the deck has the same chance of being selected. Calculate P(A), P(K) and P(A ∩ K).
3. Are A and K independent?
4. Compare P (A) with P (A|K) and comment on the result.
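The counting in Example 22 can be checked by enumerating every two-card hand. The sketch below assumes the two cards are drawn without replacement and that every unordered pair is equally likely; note that the six elements of S listed above are not themselves equally likely, so the counting is done over hands rather than over S.

```python
from fractions import Fraction
from itertools import combinations

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
deck = [(r, s) for r in ranks for s in "HDCS"]
hands = list(combinations(deck, 2))   # 1326 equally likely unordered pairs

def prob(event):
    """Probability that a random two-card hand satisfies `event`."""
    return Fraction(sum(1 for h in hands if event(h)), len(hands))

def has_ace(hand):
    return any(rank == "A" for rank, _ in hand)

def has_king(hand):
    return any(rank == "K" for rank, _ in hand)

p_A = prob(has_ace)
p_K = prob(has_king)
p_AK = prob(lambda h: has_ace(h) and has_king(h))
p_A_given_K = p_AK / p_K

print(p_A, p_K, p_AK)          # 33/221 33/221 8/663
print(p_AK == p_A * p_K)       # False: A and K are not independent
print(p_A_given_K)             # 8/99
```

Here P(A|K) = 8/99 is smaller than P(A) = 33/221: knowing that one of the two cards is a king leaves only one card that could be an ace.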
Calculating joint probabilities: a summary
We have now seen various ways to calculate a joint probability P(A ∩ B). These are as follows.
1. Assume independence
If we think that learning A has occurred will not change our probability of B occurring (and vice versa) then we have
P (A ∩ B) = P (A)P (B).
Example 23. If I buy a single National Lottery ticket, the probability I don’t win any prize is 53/54. If I buy one ticket every week, what is the probability I win nothing in the first two weeks? What is the probability I win nothing in the first month?
2. Direct calculation using classical probability
When using classical probability (assuming the elements of the sample space are equally likely), it may be straightforward to count the number of outcomes in which both A and B occur, and hence calculate P(A ∩ B) directly. In a sense, we are saying that if we write C as the event A ∩ B, it is straightforward to calculate P(C).
Example 24. One playing card is drawn at random from a standard deck of 52. Let A be the event that the card is a red face card (king, queen or jack). Let B be the event that the card is a king.
(a) Are A and B independent?
(b) What is P (A ∩ B)?
3. Using conditional probability
As has already been commented, we have
P (A ∩ B) = P (A)P (B|A) = P (B)P (A|B).
If we already know P (A) and P (B|A), or P (B) and P (A|B), we can calculate P (A ∩ B).
Example 25. (Example 19 revisited). Suppose 1% of people in a population have a particular disease. A diagnostic test has a 95% chance of detecting the disease in a person carrying the disease. One person is selected at random. What is the probability that they have the disease and the diagnostic test detects it?
Having three methods may seem confusing! In fact, the conditional probability method can be thought of as the way to calculate a joint probability (remembering that conditional probabilities can be specified directly), with the first two methods being short cuts or special cases.
Example 26. Show how the conditional probability method can be used to calculate the joint probabilities in Examples 23 and 24.
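The three methods can be compared side by side on Examples 23 to 25. The following is a sketch, not a model solution: it assumes that weekly lottery draws are independent and, since the notes do not say how many draws a month contains, that "the first month" means four weekly tickets.

```python
from fractions import Fraction

# Method 1 (Example 23): assume the weekly draws are independent.
p_no_prize = Fraction(53, 54)
p_two_weeks = p_no_prize ** 2     # 2809/2916
p_four_weeks = p_no_prize ** 4    # assumption: a month is 4 weekly tickets

# Method 2 (Example 24): count outcomes directly in a 52-card deck.
ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["hearts", "diamonds", "clubs", "spades"]
deck = [(r, s) for r in ranks for s in suits]
A = {c for c in deck if c[0] in ("K", "Q", "J") and c[1] in ("hearts", "diamonds")}
B = {c for c in deck if c[0] == "K"}
p_A = Fraction(len(A), len(deck))            # 6/52
p_B = Fraction(len(B), len(deck))            # 4/52
p_A_and_B = Fraction(len(A & B), len(deck))  # the two red kings
independent = p_A_and_B == p_A * p_B         # False

# Method 3: the same joint probability via P(A) P(B|A).
p_B_given_A = Fraction(len(A & B), len(A))   # 2 of the 6 red face cards are kings
p_A_and_B_via_conditional = p_A * p_B_given_A

# Method 3 (Example 25): P(disease and detected) = P(D) P(T|D).
p_disease = Fraction(1, 100)
p_detect_given_disease = Fraction(95, 100)
p_disease_and_detect = p_disease * p_detect_given_disease

print(p_two_weeks, p_A_and_B, p_A_and_B_via_conditional, p_disease_and_detect)
# 2809/2916 1/26 1/26 19/2000
```

Under independence, P(B|A) = P(B), so Method 1 is Method 3 with the conditional probability replaced by the unconditional one; in Example 24 that replacement is not valid, which is why counting or conditioning is needed there.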
7.3 The law of total probability
In some situations, calculating a probability of an event is easiest if we first consider some appropriate conditional probabilities.
Example 27. Suppose the four teams in this year’s Champions League semi-finals are Manchester United, Barcelona, Milan, and Bayern Munich. Depending on their opponents, you judge Manchester United’s probabilities of reaching the final to be
P (reach final|opponent is Barcelona) = 0.2, P (reach final|opponent is Milan) = 0.4, P (reach final|opponent is Bayern Munich) = 0.5.
If the semi-final draw has yet to be made (with any two teams having the same probability of being drawn against each other), what is your probability of Manchester United reaching the final?
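Example 27 can be worked by weighting each conditional probability by the probability of the corresponding draw and adding the results. The sketch below does this in Python; the function name `total_probability` is illustrative, and the equal weights of 1/3 come from the stated assumption that each opponent is equally likely.

```python
from fractions import Fraction

def total_probability(priors, likelihoods):
    """Return P(F) = sum over i of P(E_i) P(F | E_i).

    priors[i] is P(E_i) for a partition E_1, ..., E_n (so they must sum to 1),
    and likelihoods[i] is P(F | E_i).
    """
    if sum(priors) != 1:
        raise ValueError("the probabilities of a partition must sum to 1")
    return sum(p * l for p, l in zip(priors, likelihoods))

# Opponents in the order: Barcelona, Milan, Bayern Munich.
priors = [Fraction(1, 3)] * 3
likelihoods = [Fraction(2, 10), Fraction(4, 10), Fraction(5, 10)]

p_final = total_probability(priors, likelihoods)
print(p_final)   # 11/30
```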
Theorem 3. (The law of total probability) Suppose we have a partition E = {E1, . . . , En} of a sample space S. Then for any event F,
P(F) = P(E1)P(F|E1) + . . . + P(En)P(F|En).
That is, to calculate P(F) we consider how likely each event Ei is, and then how likely F is conditional on Ei.
7.4 Bayes’ theorem
In Example 19 we considered a diagnostic test, and the probability of the test detecting the disease in someone who has it. But diagnostic tests can sometimes produce ‘false positives’: a test may claim the presence of the disease in someone who does not have it. In these situations, we will want to know how likely it is someone has the disease, conditional on their test result.
Example 28. A new diagnostic test has been developed for a particular disease. It is known that 0.1% of people in the population have the disease. The test will detect the disease in 95% of all people who really do have the disease. However, there is also the possibility of a “false positive”; out of all people who do not have the disease, the test will claim they do in 2% of cases. A person is chosen at random to take the test, and the result is “positive”. How likely is it that that person has the disease?
Theorem 4. (Bayes’ theorem) Suppose we have a partition E = {E1, . . . , En} of a sample space S. Then for any event F with P(F) > 0,
P(Ei|F) = P(Ei)P(F|Ei) / P(F).
Note that we can calculate P(F) via the law of total probability:
P(F) = P(E1)P(F|E1) + . . . + P(En)P(F|En).
Note that if E is a single event then {E, Ē} is a partition, so Bayes’ theorem gives us
P(E|F) = P(F|E)P(E) / [P(F|E)P(E) + P(F|Ē)P(Ē)].
In the context of Bayes’ theorem, we sometimes refer to P (Ei) as the prior probability of Ei, and P (Ei|F ) as the posterior probability of Ei given F . The prior probability states how likely we thought Ei was before we knew that F had occurred, and the posterior probability states how likely we think Ei is after we have learnt that F has occurred.
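To illustrate prior and posterior probabilities, the following sketch applies Bayes’ theorem to the figures in Example 28, with the partition {has disease, does not have disease}. The function name `bayes` and the list-based interface are illustrative assumptions.

```python
from fractions import Fraction

def bayes(priors, likelihoods, i):
    """Posterior P(E_i | F) for a partition E_1, ..., E_n.

    priors[j] is P(E_j) and likelihoods[j] is P(F | E_j).
    """
    # Denominator P(F) from the law of total probability.
    p_F = sum(p * l for p, l in zip(priors, likelihoods))
    return priors[i] * likelihoods[i] / p_F

# Index 0: has the disease; index 1: does not have the disease.
priors = [Fraction(1, 1000), Fraction(999, 1000)]     # 0.1% prevalence
likelihoods = [Fraction(95, 100), Fraction(2, 100)]   # P(positive | each case)

posterior = bayes(priors, likelihoods, 0)
print(posterior, float(posterior))   # 95/2093, roughly 0.045
```

The prior probability of disease is 0.001 and the posterior after a positive result is about 0.045: much larger than the prior, but still small, because false positives among the many healthy people outnumber true positives.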
7.5 Conditional probabilities: deciding which formula to use
Initially, it can seem confusing that we have two formulae for conditional probabilities: the definition in equation (3) and Bayes’ theorem (and we have also said that you can sometimes just write down a conditional probability directly). Firstly, you should note that these aren’t really ‘different’ formulae.
To get Bayes’ theorem, we have just started with the conditional probability definition
P(A|B) = P(A ∩ B) / P(B),
then rewritten the numerator using P(A ∩ B) = P(A)P(B|A) and rewritten the denominator using the law of total probability. With practice, you will quickly learn to recognise what probabilities you already know, and so how to calculate P(A|B). However, to start with, you may find it helpful to use the following scheme.
1. Is it ‘obvious’?
You may have the information you need to write down P (A|B) directly.
If not, carry on to Step 2.
2. Write down the conditional probability formula
P(A|B) = P(A ∩ B) / P(B).
3. Consider the numerator P(A ∩ B)
(a) Is it straightforward to write down or calculate P (A ∩ B)? If so, move on to Step 4.
(b) Do you know P(A) and P(B|A)? If so, you can calculate P(A ∩ B) using
P(A ∩ B) = P(A)P(B|A).
(In this case, you are now using Bayes’ Theorem).
4. Consider the denominator P (B)
If you know this value already, you’re done. If not, try the law of total probability:
P(B) = P(A)P(B|A) + P(Ā)P(B|Ā).
Try working through Examples 17, 20 and 28 again, following the scheme above.
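Steps 2 to 4 of the scheme can be sketched as a single Python function that uses whichever quantities are already known. The function name and keyword arguments are hypothetical, chosen only for illustration; Step 1 (writing the answer down directly) needs no code.

```python
from fractions import Fraction

def p_a_given_b(p_joint=None, p_b=None, p_a=None,
                p_b_given_a=None, p_b_given_not_a=None):
    """Compute P(A|B) = P(A ∩ B) / P(B) from whatever is known."""
    # Step 3: the numerator P(A ∩ B).
    if p_joint is None:
        if p_a is None or p_b_given_a is None:
            raise ValueError("need P(A ∩ B), or both P(A) and P(B|A)")
        p_joint = p_a * p_b_given_a            # Step 3(b)
    # Step 4: the denominator P(B).
    if p_b is None:
        if p_a is None or p_b_given_a is None or p_b_given_not_a is None:
            raise ValueError("need P(B), or P(A), P(B|A) and P(B|not A)")
        # Law of total probability over the partition {A, not A}.
        p_b = p_a * p_b_given_a + (1 - p_a) * p_b_given_not_a
    if p_b <= 0:
        raise ValueError("P(B) must be positive")
    return p_joint / p_b

# Example 17: numerator and denominator are both known directly.
ex17 = p_a_given_b(p_joint=Fraction(2, 5), p_b=Fraction(1, 2))

# Example 28: build both parts from P(A), P(B|A) and P(B|not A).
ex28 = p_a_given_b(p_a=Fraction(1, 1000),
                   p_b_given_a=Fraction(95, 100),
                   p_b_given_not_a=Fraction(2, 100))
print(ex17, ex28)   # 4/5 95/2093
```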