
7 Sum of Random Variables and the Central Limit Theorem

Random variables of the form

$$S_n = X_1 + X_2 + \cdots + X_n$$

appear repeatedly in probability theory and applications. For example, in the insurance context, $S_n$ can represent the total claims paid on all policies, where $X_i$ is the $i$th claim. Thus, it is useful to be able to determine properties of $S_n$.

For the expected value of $S_n$, we have

$$E(S_n) = E(X_1) + E(X_2) + \cdots + E(X_{n-1}) + E(X_n).$$

A similar formula holds for the variance provided that the $X_i$'s are independent⁴ random variables. In this case,

$$\mathrm{Var}(S_n) = \mathrm{Var}(X_1) + \mathrm{Var}(X_2) + \cdots + \mathrm{Var}(X_n).$$
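The two formulas above can be checked exactly on a small case. The following is a minimal sketch, assuming (purely for illustration) that the $X_i$ are independent fair six-sided dice; the helper names `mean`, `variance`, and `sum_pmf` are hypothetical and not part of the text.

```python
from fractions import Fraction
from itertools import product

# Assumption for illustration: each X_i is a fair six-sided die,
# and the dice are independent.
faces = range(1, 7)
p = Fraction(1, 6)
die_pmf = {x: p for x in faces}


def mean(pmf):
    """Expected value of a discrete distribution given as {value: probability}."""
    return sum(x * q for x, q in pmf.items())


def variance(pmf):
    """Variance of a discrete distribution given as {value: probability}."""
    m = mean(pmf)
    return sum((x - m) ** 2 * q for x, q in pmf.items())


def sum_pmf(n):
    """Exact pmf of S_n = X_1 + ... + X_n by enumerating all 6**n outcomes."""
    pmf = {}
    for outcome in product(faces, repeat=n):
        s = sum(outcome)
        # Independence: each outcome has probability (1/6)**n.
        pmf[s] = pmf.get(s, Fraction(0)) + p ** n
    return pmf


n = 3
s_pmf = sum_pmf(n)
print("E(S_n) equals n*E(X_1):", mean(s_pmf) == n * mean(die_pmf))
print("Var(S_n) equals n*Var(X_1):", variance(s_pmf) == n * variance(die_pmf))
```

Exact rational arithmetic is used so that the comparison is an equality rather than a floating-point approximation.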

The central limit theorem reveals a fascinating property of the sum of independent random variables. It states that the CDF of the standardized sum converges to the standard normal CDF as the number of terms grows without limit. This theorem allows us to use the properties of the standard normal random variable to obtain accurate estimates of probabilities associated with sums of other random variables.

Theorem 7.1

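Let $X_1, X_2, \ldots$ be a sequence of independent and identically distributed random variables, each with mean $\mu$ and variance $\sigma^2$. Then,

$$P\left(\frac{\sqrt{n}}{\sigma}\left(\frac{X_1 + X_2 + \cdots + X_n}{n} - \mu\right) \le a\right) \to \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{a} e^{-x^2/2}\,dx$$

as $n \to \infty$.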

The Central Limit Theorem says that regardless of the underlying distribution of the variables $X_i$, so long as they are independent and identically distributed with finite variance, the distribution of

$$\frac{\sqrt{n}}{\sigma}\left(\frac{X_1 + X_2 + \cdots + X_n}{n} - \mu\right)$$

converges to the same, normal, distribution.
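The convergence can be observed numerically. The sketch below assumes, for illustration only, that the $X_i$ are fair six-sided dice (mean 3.5, variance 35/12). It computes the exact CDF of the standardized sum by repeated convolution and reports its largest distance from the standard normal CDF, measured at the jump points of the sum. The function names `convolve` and `max_cdf_gap` are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def convolve(p, q):
    """pmf of the sum of two independent integer-valued variables (lists indexed from 0)."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def max_cdf_gap(n):
    """Largest gap, over the jump points, between the CDF of the
    standardized sum of n fair dice and the standard normal CDF."""
    die = [1 / 6] * 6          # index j stands for face j + 1
    pmf = [1.0]                # pmf of the empty sum
    for _ in range(n):
        pmf = convolve(pmf, die)
    mu, sigma = 3.5, sqrt(35 / 12)
    phi = NormalDist().cdf
    gap, cdf = 0.0, 0.0
    for k, prob in enumerate(pmf):
        s = k + n              # index k stands for the sum k + n
        cdf += prob
        z = (s - n * mu) / (sigma * sqrt(n))
        gap = max(gap, abs(cdf - phi(z)))
    return gap


for n in (1, 5, 30, 100):
    print(n, round(max_cdf_gap(n), 4))
```

The printed gaps shrink as $n$ grows, which is the behavior the theorem describes. Dice were chosen only because their sums can be computed exactly; any distribution with finite variance could be substituted.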

⁴ We say that $X$ and $Y$ are independent random variables if and only if for any two sets of real numbers $A$ and $B$ we have

$$P(X \in A, Y \in B) = P(X \in A)\,P(Y \in B).$$

Example 7.1

The weight of a typewriter has a mean of 20 pounds and a variance of 9 square pounds. Consider a train that carries 200 of these typewriters. Estimate the probability that the total weight of typewriters carried in the train exceeds 4050 pounds.

Solution.

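Label the typewriters as Typewriter 1, Typewriter 2, etc. Let $X_i$ be the weight of Typewriter $i$. The total weight $\sum_{i=1}^{200} X_i$ has mean $200(20) = 4000$ and, assuming the weights are independent, standard deviation $3\sqrt{200}$. Thus,

$$P\left(\sum_{i=1}^{200} X_i > 4050\right) = P\left(\frac{\sum_{i=1}^{200} X_i - 4000}{3\sqrt{200}} > \frac{4050 - 4000}{3\sqrt{200}}\right) \approx P(Z > 1.179) = 1 - \Phi(1.179) \approx 0.119.$$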
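The normal approximation for Example 7.1 can be reproduced with the Python standard library. This is a minimal sketch that assumes the typewriter weights are independent; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n = 200           # number of typewriters
mu = 20           # mean weight of one typewriter (pounds)
var = 9           # variance of one typewriter's weight
threshold = 4050  # total weight of interest (pounds)

mean_total = n * mu        # E(S_n) = n * mu
sd_total = sqrt(n * var)   # sqrt(Var(S_n)) = sqrt(n * var), using independence

# Standardize the threshold and take the upper tail of the standard normal.
z = (threshold - mean_total) / sd_total
prob = 1 - NormalDist().cdf(z)

print("z =", round(z, 3))
print("P(total > 4050) is approximately", round(prob, 3))
```

Changing `threshold`, `n`, `mu`, or `var` adapts the same calculation to other sums of independent, identically distributed quantities.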

Example 7.2

In an analysis of healthcare data, ages have been rounded to the nearest multiple of 5 years. The difference between the true age and the rounded age is assumed to be uniformly distributed on the interval from −2.5 years to 2.5 years. The healthcare data are based on a random sample of 48 people.

What is the approximate probability that the mean of the rounded ages is within 0.25 years of the mean of the true ages?

Solution.

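Let $X$ denote the difference between true and reported age. We are given that $X$ is uniformly distributed on $(-2.5, 2.5)$. That is, $X$ has pdf $f(x) = 1/5$ for $-2.5 < x < 2.5$ and $0$ otherwise. It follows that $E(X) = 0$ and

$$\sigma_X^2 = E(X^2) = \int_{-2.5}^{2.5} \frac{x^2}{5}\,dx \approx 2.083,$$

so that $\sigma_X \approx 1.443$.

Now $\bar{X}_{48}$, the difference between the means of the true and rounded ages, has a distribution that is approximately normal with mean 0 and standard deviation $\frac{1.443}{\sqrt{48}} \approx 0.2083$. Therefore,

$$P\left(-\tfrac{1}{4} \le \bar{X}_{48} \le \tfrac{1}{4}\right) = P\left(\frac{-0.25}{0.2083} \le Z \le \frac{0.25}{0.2083}\right) = P(-1.2 \le Z \le 1.2) = 2\Phi(1.2) - 1 \approx 2(0.8849) - 1 \approx 0.77.$$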
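The healthcare rounding example can be checked numerically. The sketch below uses the variance $(b-a)^2/12$ of a uniform distribution on $(a, b)$ and the normal approximation for the sample mean; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

a, b = -2.5, 2.5   # support of the rounding error X
n = 48             # sample size
tol = 0.25         # "within 0.25 years"

sd_x = sqrt((b - a) ** 2 / 12)   # standard deviation of one rounding error
sd_mean = sd_x / sqrt(n)         # standard deviation of the sample mean

# P(-tol <= mean error <= tol) under the normal approximation (mean error is 0).
z = tol / sd_mean
prob = 2 * NormalDist().cdf(z) - 1

print("sd of X:", round(sd_x, 3))
print("sd of sample mean:", round(sd_mean, 4))
print("z:", round(z, 3))
print("approximate probability:", round(prob, 4))
```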

Example 7.3

Let $X_1, X_2, X_3, X_4$ be a random sample of size 4 from a normal distribution with mean 2 and variance 10, and let $\bar{X}$ be the sample mean. Determine $a$ such that $P(\bar{X} \le a) = 0.90$.

Solution.

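The sample mean $\bar{X}$ is normal with mean $\mu = 2$ and variance $\frac{\sigma^2}{n} = \frac{10}{4} = 2.5$, and standard deviation $\sqrt{2.5} \approx 1.58$, so

$$0.90 = P(\bar{X} \le a) = P\left(\frac{\bar{X} - 2}{1.58} < \frac{a - 2}{1.58}\right) = \Phi\left(\frac{a - 2}{1.58}\right).$$

From the normal table, we get $\frac{a - 2}{1.58} = 1.28$, so $a = 4.02$.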
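The percentile in Example 7.3 can also be obtained without a normal table. This is a minimal sketch using `statistics.NormalDist` from the Python standard library; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu = 2     # mean of each observation
var = 10   # variance of each observation
n = 4      # sample size

# The sample mean of a normal sample is normal with mean mu and variance var / n.
xbar = NormalDist(mu=mu, sigma=sqrt(var / n))

# a is the 90th percentile of the sample mean: P(sample mean <= a) = 0.90.
a = xbar.inv_cdf(0.90)

print("a is approximately", round(a, 3))
print("check, P(sample mean <= a):", round(xbar.cdf(a), 4))
```

The value computed this way differs slightly in the second decimal place from the table-based answer, because the table lookup rounds both the standard deviation (1.58) and the normal quantile (1.28).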

Practice Problems

Problem 7.1

A shipping agency ships boxes of booklets with each box containing 100 booklets. Suppose that the average weight of a booklet is 1 ounce and the standard deviation is 0.05 ounces. What is the probability that 1 box of booklets weighs more than 100.4 ounces?

Problem 7.2

In the National Hockey League, the standard deviation in the distribution of players’ height is 2 inches. The heights of 25 players selected at random were measured. Estimate the probability that the average height of the players in this sample is within 1 inch of the league average height.

Problem 7.3

A battery manufacturer claims that the lifespan of its batteries has a mean of 54 hours and a standard deviation of 6 hours. A sample of 60 batteries was tested. What is the probability that the mean lifetime is less than 52 hours?

Problem 7.4

Roll a die 10 times. Estimate the probability that the sum obtained is between 30 and 40, inclusive.

Problem 7.5

Consider 10 independent random variables, each uniformly distributed over (0,1). Estimate the probability that the sum of the variables exceeds 6.

Problem 7.6

The Chicago Cubs play 100 independent baseball games in a given season. Suppose that the probability of winning a game is 0.8. What is the probability that they win at least 90?

Problem 7.7

An insurance company has 10,000 home policyholders. The average annual claim per policyholder is found to be $240 with a standard deviation of $800.

Estimate the probability that the total annual claim is at least $2.7 million.

Problem 7.8

A certain component is critical to the operation of a laptop and must be replaced immediately upon failure. It is known that the average life of this type of component is 100 hours and its standard deviation is 30 hours. Estimate the number of such components that must be available in stock so that the system remains in continual operation for the next 2000 hours with probability of at least 0.95.

Problem 7.9

An instructor found that the average student score on class exams is 74 and the standard deviation is 14. This instructor gives two exams: one to a class of size 25 and the other to a class of 64. Using the Central Limit Theorem, estimate the probability that the average test score in the class of size 25 is at least 80.

Problem 7.10

The Salvation Army received 2025 contributions. Assume the contributions are independent and identically distributed with mean 3125 and standard deviation 250. Estimate the 90th percentile for the distribution of the total contributions received.

Problem 7.11 ‡

An insurance company issues 1250 vision care insurance policies. The number of claims filed by a policyholder under a vision care insurance policy during one year is a Poisson random variable with mean 2. Assume the numbers of claims filed by distinct policyholders are independent of one another.

What is the approximate probability that there is a total of between 2450 and 2600 claims during a one-year period?

Problem 7.12

A battery manufacturer finds that the lifetime of a battery, expressed in months, follows a normal distribution with mean 3 and standard deviation 1. Suppose that you want to buy a number of these batteries with the intention of replacing them successively into your radio as they burn out.

Assuming that the batteries’ lifetimes are independent, what is the smallest number of batteries to be purchased so that the succession of batteries keeps your radio working for at least 40 months with probability exceeding 0.9772?

Problem 7.13

The total claim amount for a home insurance policy has a pdf

$$f(x) = \begin{cases} \frac{1}{1000} e^{-x/1000}, & x > 0 \\ 0, & \text{otherwise.} \end{cases}$$

An actuary sets the premium for the policy at 100 over the expected total claim amount. If 100 policies are sold, estimate the probability that the insurance company will have claims exceeding the premiums collected.

Problem 7.14 ‡

A city has just added 100 new female recruits to its police force. The city will provide a pension to each new hire who remains with the force until retirement. In addition, if the new hire is married at the time of her retirement, a second pension will be provided for her husband. A consulting actuary makes the following assumptions:

(i) Each new recruit has a 0.4 probability of remaining with the police force until retirement.

(ii) Given that a new recruit reaches retirement with the police force, the probability that she is not married at the time of retirement is 0.25.

(iii) The number of pensions that the city will provide on behalf of each new hire is independent of the number of pensions it will provide on behalf of any other new hire.

Determine the probability that the city will provide at most 90 pensions to the 100 new hires and their husbands.

Problem 7.15

The amount of an individual claim has a two-parameter Pareto distribution with θ = 8000 and α = 9. Consider a sample of 500 claims. Estimate the probability that the total sum of the claims is at least 550,000.

Problem 7.16

Suppose that the current profit from selling a share of a stock is found to follow a uniform distribution on [−45, 72]. Using the central limit theorem, approximate the probability of making a profit from the sale of 55 stocks.

Problem 7.17

The severities of individual claims have the Pareto distribution with parameters α = 8/3 and θ = 8000. Use the central limit theorem to approximate the probability that the sum of 100 independent claims will exceed 600,000.

8 Moment Generating Functions and Probability