Random variables

In document Statistics and Data With R (Page 141-150)

Probability and random variables

4.6 Random variables

Events are associated with probabilities. A function that assigns real values to events associates the events’ probabilities with those real values. Such functions are appropriately called random variables. From here on, we will use rv to denote both random variable and random variables.

Assigning real values to events leads to rv. The values of the rv inherit the probabilities (and operations on these probabilities) of their corresponding events. The links between the values that a rv takes and the probabilities assigned to these values then lead to densities and distributions. These links are illustrated in Figure 4.11. We discussed the link P(E) throughout this chapter; it corresponds to the definition of probability. The remaining links are discussed here.

Figure 4.11 A random variable is a mapping of events to the real line.

Throughout, we use the concept of a real line. The real line is the familiar line that extends from −∞ to ∞. It has an origin at 0 and each point on the line has a value. The latter reflects the distance of the point from the origin. These values are called real numbers. We will agree that the (extended) real line includes both −∞ and ∞.

We define rv thus:

Random variable A function that assigns real numbers to events, including the null event.

We usually denote rv with upper case letters, such as X and Y. As Figure 4.11 illustrates, a rv is a mapping of events to values on the real line. In this context, we say that the sample space, S, is the domain and the real line, R, is the range. We write this as

X(E) : S → R.

The definition of a rv implies that the assignment of real numbers to events can be arbitrary. Because the definition includes the null event, we are free to assign any real value or range of values to the null event.
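The mapping can be made concrete with a minimal sketch in Python. It assumes a hypothetical experiment that is not taken from the text (two fair coin flips) and an illustrative rv X that counts heads; the names `sample_space`, `X` and `distribution` are ours. For simplicity the sketch maps elementary outcomes rather than arbitrary events.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

# Assumed experiment (not from the text): two fair coin flips.
sample_space = ["".join(p) for p in product("HT", repeat=2)]  # HH, HT, TH, TT
prob = {outcome: Fraction(1, 4) for outcome in sample_space}  # equally likely

def X(outcome):
    """Illustrative rv: the number of heads in the outcome."""
    return outcome.count("H")

# Each value of X inherits the total probability of the outcomes mapped to it.
distribution = defaultdict(Fraction)
for outcome, p in prob.items():
    distribution[X(outcome)] += p

for value in sorted(distribution):
    print(value, distribution[value])
```

The printed pairs show how the values 0, 1 and 2 on the real line carry the probabilities of the events that map to them.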

4.7 Assignments

Exercise 4.1. Males and females cross a street in no particular order. We note the gender of the first and second people who cross the street. The possible outcomes consist of male first and male second, male first and female second, and so on. Let F and M be the events that a female or a male crossed the street, respectively. Then S = {MM, MF, FM, FF}. Verify that the power set consists of 2^4 = 16 subsets.
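One way to check a hand-written list of subsets is to enumerate the power set programmatically. The following is a small sketch in Python; the helper name `power_set` is ours and not part of the exercise.

```python
from itertools import combinations

S = ["MM", "MF", "FM", "FF"]

def power_set(items):
    """Return every subset of items, from the empty set up to the full set."""
    return [set(c) for r in range(len(items) + 1) for c in combinations(items, r)]

subsets = power_set(S)
print(len(subsets))  # number of subsets of a 4-element set
for subset in subsets:
    print(sorted(subset))
```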

Exercise 4.2. This exercise demonstrates DeMorgan’s Laws. Draw a Venn diagram picturing A and B that partially overlap.

1. Shade not (A or B). On a separate diagram, shade (not A) and (not B). Compare the two diagrams.

2. Shade not (A and B). On a separate diagram shade (not A) or (not B). Compare the two diagrams.
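The diagrams can be cross-checked numerically with sets. This sketch assumes an arbitrary universe U and two partially overlapping sets A and B chosen only for illustration; it checks one instance and is not a proof of the laws.

```python
# Assumed illustrative sets (not from the text); A and B partially overlap.
U = set(range(1, 11))
A = {1, 2, 3, 4, 5}
B = {4, 5, 6, 7}

def complement(E):
    """Elements of the universe U that are not in E."""
    return U - E

# not (A or B) equals (not A) and (not B)
law1 = complement(A | B) == complement(A) & complement(B)
# not (A and B) equals (not A) or (not B)
law2 = complement(A & B) == complement(A) | complement(B)
print(law1, law2)
```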

Exercise 4.3. An experiment consists of rolling a die and flipping a coin.

1. What is the sample space S? How many outcomes are in the sample space?

2. What are the outcomes of the event E that the side of the die facing up shows an even number of dots?

3. What are the outcomes of the event F that the coin lands on H?

4. What are the outcomes of E ∪ F?

5. What are the outcomes of E ∩ F?

6. Suppose that outcomes are equally likely. Compute:

(a) P (E) (b) P (F ) (c) P (E∪ F ) (d) P (E∩ F )
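After working Exercise 4.3 by hand, the answers can be checked by enumerating the sample space. The sketch below is in Python and assumes, as part 6 does, that all outcomes are equally likely; the helper `P` is our own name.

```python
from fractions import Fraction
from itertools import product

# Each outcome is a pair (die face, coin side).
S = set(product(range(1, 7), "HT"))
E = {o for o in S if o[0] % 2 == 0}   # die shows an even number
F = {o for o in S if o[1] == "H"}     # coin lands on H

def P(event):
    """Probability of an event when all outcomes in S are equally likely."""
    return Fraction(len(event), len(S))

print(len(S))
print(P(E), P(F), P(E | F), P(E & F))
```

Note that the printed values satisfy P(E ∪ F) = P(E) + P(F) − P(E ∩ F).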

Exercise 4.4. What is the sample space of an experiment that consists of drawing a card from a standard deck and recording its suit?

Exercise 4.5. Last semester, I took note of students late to class. The results from 22 students were:

0 2 5 0 3 1 8 0 3 1 1 9 2 4 0 2 9 3 0 1 9 8

1. What proportion of the students was never late?

2. What proportion of the students was late to class at most 8 times? At least 8 times?

3. What proportion of the students was late between 3 and 6 times during the semester?
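Proportions such as those in Exercise 4.5 are counts divided by the number of students. The Python sketch below shows the pattern; it assumes that "between 3 and 6 times" includes both endpoints, which the exercise does not state, and `proportion` is our own helper name.

```python
from fractions import Fraction

# Number of times each of the 22 students was late.
late = [0, 2, 5, 0, 3, 1, 8, 0, 3, 1, 1, 9, 2, 4, 0, 2, 9, 3, 0, 1, 9, 8]

def proportion(condition):
    """Fraction of students whose count satisfies the condition."""
    return Fraction(sum(1 for x in late if condition(x)), len(late))

never = proportion(lambda x: x == 0)
at_most_8 = proportion(lambda x: x <= 8)
at_least_8 = proportion(lambda x: x >= 8)
between_3_and_6 = proportion(lambda x: 3 <= x <= 6)  # assumed inclusive
print(never, at_most_8, at_least_8, between_3_and_6)
```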

Exercise 4.6. We record the fate of patients who arrive at a hospital emergency room. The two possible outcomes are the patient is admitted for further treatment or released. To test the hypothesis (we shall do that later) that the fate of two consecutive patients is independent, we choose the first patient at random and record his/her fate. Then we record the fate of the next arrival.

1. What is the sample space, i.e. the set of all possible outcomes?

2. Show the sample space in a tree diagram.

3. List the outcomes of the event B that at least one patient was released.

4. List the outcomes of the event C that exactly one patient was released.

5. List the outcomes of the event D that none of the patients were released.

6. Which of the events B, C and D is elementary?

7. List the outcomes in the events B and C.

8. List the outcomes in the events B or D.

Exercise 4.7. To test the efficacy of admissions or release, an emergency room embarks on a controlled experiment. Each experiment (there are many of them, but we shall examine only one) consists of choosing 4 patients at random on a particular night. The patients are selected from a group where the doctors are not certain whether they should be admitted or not. Name the patients P1, P2, P3 and P4. Of these 4 patients, we choose 2 at random. The first patient will be released and the second admitted.

1. Display a tree diagram of the possible outcomes.

2. Denote by A the event that at least one of the patients has an even numbered index (P2 and P4 have even numbered indices). Which outcomes are included in A?

3. Suppose that P1 and P2 are over 50 years old and P3 and P4 are less than 40 years old. Denote by B the event that exactly one of the patients selected is over 50 years old. Which outcomes are included in B?

Exercise 4.8. Starting at a certain time, you observe deer crossing a road and record their sex (M = male, F = female). The experiment terminates as soon as a male is observed.

1. Give 5 possible experimental outcomes.

2. How many outcomes are there in the sample space?

3. Let E = number of deer observed is even. What outcomes are in E?

Exercise 4.9. The following is a subset of the vital statistics data obtained from WHO (see Example 2.7). The data were collected during 1995 to 2000 and reported in 2003. Data include the death rate (per 1000 per year) for Eastern African countries only.

18 United Republic of Tanzania 18.1

19 Zambia 28.0

20 Zimbabwe 27.0

A person is picked at random from Eastern Africa.

1. Which country (or countries) could the person have come from if you are told that his probability of dying during the next year is greater than 0.0167?

2. Which country (or countries) could the person have come from if you are told that his probability of dying during the next year is smaller than 0.0067?

3. Which country (or countries) could the person have come from if you are told that his probability of dying during the next year is larger than 0.00167 and smaller than 0.0181?

Exercise 4.10. All of the terrorists in the 9/11 attack on the Twin Towers came from Middle Eastern Arab countries. The populations of Middle Eastern Arab countries (from the WHO data, see Example 2.7) are as follows (in 1 000):

   country                           pop
 1 Bahrain                           724
 2 Egypt                           71931
 3 Iran (Islamic Republic of)      68919
 4 Iraq                            25174
 5 Jordan                           5472
 6 Kuwait                           2521
 7 Lebanon                          3652
 8 Libyan Arab Jamahiriya           5550
 9 Occupied Palestinian Territory   3557
10 Oman                             2851
11 Saudi Arabia                    24217
12 Syrian Arab Republic            17799
13 United Arab Emirates             2994
14 Yemen                           20010

Suppose that these terrorists were assembled independently.

1. What is the probability that one of the terrorists came from Saudi Arabia?

2. What is the probability that one of the terrorists came from Saudi Arabia or Egypt?

3. What is the probability that one of the terrorists came from neither Saudi Arabia nor from Egypt?

Exercise 4.11. A single card is randomly selected from a well-mixed deck.

1. How many elementary events are there?

2. What is the probability of an elementary event?

3. What is the probability that the selected card is a diamond? A face card (Jack, Queen or King)?

4. What is the probability that the selected card is both a diamond and a face card?

5. Let A be the event that the selected card is a face and B the event that the selected card is a diamond. What is P (A or B)?

Exercise 4.12. Based on a questionnaire, a matching service finds 4 men and 4 women that match perfectly and are predicted to have a happy marriage. Any incorrect matching is predicted to result in a failed marriage. In their infinite wisdom, the matching service pairs the customers completely randomly. That is, all outcomes are equally likely. Label the males as A, B, C and D. To simplify the notation, consider one possible outcome: A is paired with B’s perfect match, B is paired with C’s perfect match, C is paired with D’s perfect match and D is paired with A’s perfect match. We write this outcome as {B, C, D, A}.

1. List the possible outcomes.

2. Consider the event that exactly two of the matchings result in a happy marriage. List the outcomes contained in this event.

3. What is the probability of this event?

4. What is the probability that exactly one matching results in a happy marriage?

5. What is the probability that exactly three matchings result in happy marriages?

6. What is the probability that at least two of the four matches result in happy marriages?
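The outcomes in Exercise 4.12 are the orderings of A, B, C and D, so they can be enumerated to check hand counts. The Python sketch below follows the exercise's notation: position i holds the man whose perfect match was given to the i-th man, so a matching is happy when a letter sits in its own position. The names `happy`, `counts` and `prob` are ours.

```python
from collections import Counter
from fractions import Fraction
from itertools import permutations

men = "ABCD"
outcomes = list(permutations(men))  # e.g. ('B', 'C', 'D', 'A')

def happy(outcome):
    """Number of men paired with their own perfect match."""
    return sum(man == match for man, match in zip(men, outcome))

counts = Counter(happy(o) for o in outcomes)
prob = {k: Fraction(counts[k], len(outcomes)) for k in range(len(men) + 1)}

print(len(outcomes))
for k in sorted(prob):
    print(k, prob[k])
print("at least two:", prob[2] + prob[3] + prob[4])
```

The enumeration also makes it visible why exactly three happy matchings cannot occur: once three men are matched correctly, the fourth has only his own match left.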

Exercise 4.13. Five drug addicts are shooting heroin in a crack house. Name them A, B, C, D and E. Each of them is equally likely to die from overdose. Two of them will die by the end of the evening.

1. List the possible outcomes.

2. What is the probability of each elementary event?

3. What is the probability that one of the dead addicts is A?

Exercise 4.14. Of five people in the emergency room (ER) of a certain hospital, A and B are first time patients. For patients C, D and E, it is their second visit to the ER. Two of the five are chosen randomly for treatment by the ER intern.

1. What is the probability that both selected patients are first-time visitors?

2. What is the probability that both selected patients are second-time visitors?

3. What is the probability that at least one of the selected patients is a first-time visitor?

4. What is the probability that of the selected patients, one is a “first-timer” and the other is a “second-timer”?

Exercise 4.15. A patient is seen at a clinic. A recent epidemic in town shows that the probability that the patient suffers from the flu is 0.75. The probability that he suffers from walking pneumonia is 0.55. The probability that he suffers from both is 0.50. Denote by F the event that the patient suffers from the flu and by M that he suffers from pneumonia.

1. Interpret and compute P (F|M).

2. Interpret and compute P (M|F ).

3. Are F and M independent? Explain.

Exercise 4.16. The probability that a randomly selected student on a typical university campus showered this morning is 0.15. The probability that a randomly selected student on the campus had breakfast this morning is 0.05. The probability that a randomly selected student on the campus both took a shower and had breakfast is 0.009.

1. Given that the student took a shower, what is the probability that he had breakfast as well?

2. If a randomly selected student had breakfast, what is the probability that she also took a shower?

3. Are the events “took a shower” and “had breakfast” independent? Explain.

Exercise 4.17. In the U.S., racial profiling describes the practice of law enforcement agencies to search, stop and sometimes arrest people of a particular ethnic group more than their relative number in the population. Suppose that a population in a certain city is composed of 30% belonging to ethnic group 1 and 70% to ethnic group 2. Members of the ethnic groups are visibly different. Court records reveal that crime rate in group 1 is 25% and in group 2 it is 10%. A police officer stops a person at random. Let E1 be the event that the person belongs to group 1, E2 the event that the person belongs to group 2 and E3 the event that the person is a criminal.

1. What is the probability that the person is a criminal?

2. What is the probability that the person is from group 1 if he is a criminal?

3. What is the probability that the person is from group 2 if she is a criminal?

4. In your opinion, do the results justify racial profiling?

Exercise 4.18. A small pond has 12 fish in it. Seven of them are walleye and five are Northern pike. On a particular day, only two fish are caught. Suppose that the two fish are caught randomly.

1. What is the probability that the first fish caught is a walleye?

2. What is the probability that the second fish caught is a walleye given that the first is a walleye?

3. What is the probability that the first and the second fish caught are walleye?

4. Explain the difference in the probabilities between case 2 and case 3.

Figure 4.12 Dam gates.

Exercise 4.19. A series of gated dams along two parallel streams is shown in Figure 4.12. Denote E1 as the event that gate A functions properly, E2 as the event that gate B functions properly and so on. Suppose that P(Ei) = 0.95, i = 1, 2, 3, 4 and that gates function independently. A closed gate is considered to be functioning improperly.

1. What is the probability that water will flow uninterrupted through branch c?

2. What is the probability that water will flow uninterrupted through branch b?

3. What is the probability that water will flow through both branches uninterrupted?

4. What is the probability that water will flow through the system uninterrupted?

Exercise 4.20. To successfully treat a disease, a patient goes through a two-step treatment with clear criteria for successful treatment at each step. Let E denote the event that the first step of the treatment succeeds and F the event that the second step succeeds. The respective probabilities are P (E) = 0.45 and P (F ) = 0.25. The probability that the two-step treatment succeeds is P (E and F ) = 0.20.

1. What is the probability that at least one step of the treatment succeeds?

2. What is the probability that neither step succeeds?

3. What is the probability that exactly one of the two steps succeeds?

4. What is the probability that only the first step succeeds?

Exercise 4.21. Tuberculosis is becoming a global health problem. There are strains of the bacillus that are resistant to antibiotics. Suppose that 0.2% of individuals in a population suffer from tuberculosis. Of those who have the disease, 98% test positive when administered a diagnostic test. Of those who do not have the disease, 85% test negative when the test is applied. Choose an individual at random and administer the test. Let E be the event that a person has tuberculosis and F the event that the test is positive.

1. Construct a tree diagram with two branches: infected with tuberculosis and not infected. From each of these branches, show two branches: test positive and test negative. Show the appropriate probabilities on each of the four branches.

2. What is P (E and F )?

3. What is P (F )?

4. What is P (E|F )?
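The tree diagram in Exercise 4.21 translates directly into arithmetic: multiply along a branch, add across branches, then divide to condition. The following Python sketch uses exact fractions to avoid rounding; the variable names are ours.

```python
from fractions import Fraction

p_E = Fraction(2, 1000)                # P(E): 0.2% have tuberculosis
p_F_given_E = Fraction(98, 100)        # P(F | E): positive test when infected
p_notF_given_notE = Fraction(85, 100)  # P(not F | not E): negative when healthy

p_F_given_notE = 1 - p_notF_given_notE            # false positive rate
p_E_and_F = p_E * p_F_given_E                     # multiply along one branch
p_F = p_E_and_F + (1 - p_E) * p_F_given_notE      # add both positive branches
p_E_given_F = p_E_and_F / p_F                     # condition on a positive test

print(float(p_E_and_F), float(p_F), float(p_E_given_F))
```

Because the disease is rare and false positives are common, P(E|F) comes out far smaller than P(F|E).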

Exercise 4.22. At the time of writing, the Minneapolis - St. Paul metropolitan area has 4 major-sport teams: the Vikings (football), the Timberwolves (basketball), the Twins (baseball) and the Wild (hockey). When the teams are successful (they usually are not), game tickets are hard to come by. A scalper (a person who buys tickets at their box-office price and then sells them to the highest bidder—oddly deemed illegal in a capitalistic society) buys 5 tickets to 5 different Vikings games, 4 to 4 different Timberwolves games, 3 to 3 different Twins games and no Wild tickets. He then lets you select 3 tickets randomly.

1. In how many ways can you select one ticket for each team game?

2. In how many ways can you select 3 tickets without regard to the team?

3. If 3 tickets are selected completely randomly, what is the probability that the 3 are for different team games?
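The counts in Exercise 4.22 can be checked with a short Python sketch. It reads "one ticket for each team game" as one Vikings, one Timberwolves and one Twins ticket, which is our interpretation of the wording; `math.comb` requires Python 3.8 or later.

```python
from fractions import Fraction
from math import comb

tickets = {"Vikings": 5, "Timberwolves": 4, "Twins": 3}

# One ticket per team: multiply the number of choices for each team.
one_per_team = 1
for n in tickets.values():
    one_per_team *= n

# Any 3 of the 12 tickets, order ignored.
any_three = comb(sum(tickets.values()), 3)

p_all_different = Fraction(one_per_team, any_three)
print(one_per_team, any_three, p_all_different)
```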

Exercise 4.23. You are shopping for a computer system. You have a choice of monitor from 6 manufacturers, main unit (CPU) from 4 manufacturers and printer from 7 manufacturers. All are about equally priced. How many different system combinations can you assemble?

Exercise 4.24. You suspect that 1 of a 25-cow herd is sick with mad cow disease. The remaining 24 are healthy. You select one cow at a time and test for the disease. Once you detect the disease, you stop the experiment. What is the probability that you must examine at least 2 cows?

Exercise 4.25. You are admitted to a hospital for brain surgery. Before submitting to the operation, you wish to have an opinion from two physicians. You obtain a list of 5 physicians, along with their years of practice. The list says that the 5 physicians have been in practice for 2, 5, 7, 9 and 12 years. You choose two physicians randomly. What is the probability that the chosen two have a total of at least 13 years of practice experience?

Exercise 4.26. Each mouse entering a maze in an experiment can turn left (L), right (R), or go straight (S). The experiment terminates as soon as a mouse goes straight. Let Y denote the number of mice observed.

1. What are the possible values of Y ?

2. List 5 different outcomes and their associated Y values.

Exercise 4.27. The deepest point in a lake is 100 ft. A point is randomly selected on the surface of the lake. Let Y be the depth of the lake at the randomly selected point. What are the possible values of Y?

Exercise 4.28. A box contains four chocolate bars marked 1, 2, 3 and 4. Two bars are selected without replacement. Once you select a bar, you receive as many additional bars as the numbers that appear on the bars you select. List the possible values for each of the following random variables:

1. X = the sum of the numbers on the first and second bar

2. Y = the difference between the numbers on the first and second bar

3. Z = the number of bars selected that show an even number

4. W = the number of bars selected that show a 4

Exercise 4.29. During its 6 hour trans-Atlantic flight, it takes an airplane 15 minutes to reach a cruising altitude of 25 000 ft. It takes the airplane 15 minutes to descend from the cruising altitude until landing. Select a random time, T , between take-off and landing. Let X (T ) be the altitude of the plane at T .

1. What are the possible values of T?

2. What are the possible values of X?

3. Is X a rv? Justify your answer.

4. What is the probability that X = 25 000?

5. In answering (4), do we have to assume that the speed of the plane is approximately constant throughout the flight? Explain.

5

Discrete densities
