
Continuous Distributions

4.3 Central Limit Theorem

We now turn our attention to sums of random variables, $S_n = X_1 + \ldots + X_n$, that appear in many applications. Let $\mu = E(X_i)$ and $\sigma = \mathrm{Std}(X_i)$ for all $i = 1, \ldots, n$. How does $S_n$ behave for large $n$?

Matlab Demo. The following MATLAB code illustrates the behavior of partial sums $S_n$.

S(1)=0;
for n=2:1000; S(n)=S(n-1)+randn; end; % nth partial sum
n=1:1000;
comet(n,S); pause(3); % Behavior of S(n)
comet(n,S./n); pause(3); % Behavior of S(n)/n
comet(n,S./sqrt(n)); pause(3); % Behavior of S(n)/sqrt(n)

Apparently (MATLAB users can see it from the obtained graphs),

• The pure sum $S_n$ diverges. In fact, this should be anticipated because $\mathrm{Var}(S_n) = n\sigma^2 \to \infty$, so that the variability of $S_n$ grows unboundedly as $n$ goes to infinity.

• The average $S_n/n$ converges. Indeed, in this case, we have $\mathrm{Var}(S_n/n) = \mathrm{Var}(S_n)/n^2 = n\sigma^2/n^2 = \sigma^2/n \to 0$, so that the variability of $S_n/n$ vanishes as $n \to \infty$.

• An interesting normalization factor is $1/\sqrt{n}$. For $\mu = 0$, we can see from MATLAB simulations that $S_n/\sqrt{n}$ neither diverges nor converges! It does not tend to leave 0, but it does not converge to 0 either. Rather, it behaves like some random variable. The following theorem states that this variable has approximately Normal distribution for large $n$.

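Theorem 1 (Central Limit Theorem) Let $X_1, X_2, \ldots$ be independent random variables with the same expectation $\mu = E(X_i)$ and the same standard deviation $\sigma = \mathrm{Std}(X_i)$, and let

$$S_n = \sum_{i=1}^{n} X_i = X_1 + \ldots + X_n.$$

As $n \to \infty$, the standardized sum

$$Z_n = \frac{S_n - E(S_n)}{\mathrm{Std}(S_n)} = \frac{S_n - n\mu}{\sigma\sqrt{n}}$$

converges in distribution to a Standard Normal random variable, that is,

$$F_{Z_n}(z) = P\left\{ \frac{S_n - n\mu}{\sigma\sqrt{n}} \le z \right\} \to \Phi(z) \quad \text{for all } z. \qquad (4.18)$$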

This theorem is very powerful because it can be applied to random variables $X_1, X_2, \ldots$ having virtually any conceivable distribution with finite expectation and variance. As long as $n$ is large (the rule of thumb is $n > 30$), one can use the Normal distribution to compute probabilities about $S_n$.
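The claim can be checked numerically. The following is a minimal sketch, not part of the original demos: it assumes Exponential(1) summands (so µ = 1 and σ = 1), a sum length n = 50, and 20000 simulated sums, and it compares the empirical value of P{Zn ≤ 1} with Φ(1). The helper name Phi is introduced here for illustration; it evaluates the Standard Normal cdf through the base MATLAB function erfc.

```matlab
% Sketch: empirical check of the Central Limit Theorem for Exponential(1) summands.
n = 50;                                   % terms in each sum (assumed value)
N = 20000;                                % number of simulated sums (assumed value)
mu = 1; sigma = 1;                        % mean and std of Exponential(1)
X = -log(rand(n, N));                     % Exponential(1) samples by inverse transform
S = sum(X, 1);                            % one sum S_n per column
Z = (S - n*mu) ./ (sigma*sqrt(n));        % standardized sums Z_n
Phi = @(z) 0.5 * erfc(-z ./ sqrt(2));     % Standard Normal cdf
z0 = 1;
pEmp  = mean(Z <= z0);                    % empirical P{Z_n <= 1}
pNorm = Phi(z0);                          % limiting value Phi(1)
fprintf('empirical: %.4f, Normal limit: %.4f\n', pEmp, pNorm);
```

The two printed values should be close, although the Exponential summands are far from Normal.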

Theorem 1 is only one basic version of the Central Limit Theorem. Over the last two centuries, it has been extended to large classes of dependent variables and vectors, stochastic processes, and so on.

Example 4.13 (Allocation of disk space). A disk has free space of 330 megabytes. Is it likely to be sufficient for 300 independent images, if each image has expected size of 1 megabyte with a standard deviation of 0.5 megabytes?

Solution. We have $n = 300$, $\mu = 1$, $\sigma = 0.5$. The number of images $n$ is large, so the Central Limit Theorem applies to their total size $S_n$. Then,

$$P\{\text{sufficient space}\} = P\{S_n \le 330\} = P\left\{ \frac{S_n - n\mu}{\sigma\sqrt{n}} \le \frac{330 - (300)(1)}{0.5\sqrt{300}} \right\} \approx \Phi(3.46) = 0.9997.$$

This probability is very high, hence, the available disk space is very likely to be sufficient.
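The arithmetic of this example can be reproduced in a few lines. This is a sketch that uses only the numbers given in the example; the variable names and the helper Phi (Standard Normal cdf built on the base MATLAB function erfc) are introduced here for illustration.

```matlab
% Sketch: Normal approximation for Example 4.13 (disk space).
n = 300; mu = 1; sigma = 0.5;             % number of images, mean and std of one image
space = 330;                              % available disk space, megabytes
Phi = @(z) 0.5 * erfc(-z ./ sqrt(2));     % Standard Normal cdf
z = (space - n*mu) / (sigma*sqrt(n));     % standardized threshold, about 3.46
pEnough = Phi(z);                         % P{S_n <= 330}, about 0.9997
fprintf('z = %.2f, P{sufficient space} = %.4f\n', z, pEnough);
```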

In the special case of Normal variables $X_1, X_2, \ldots$, the distribution of $S_n$ is always Normal, and (4.18) becomes exact equality for arbitrary, even small $n$.

Example 4.14 (Elevator). You wait for an elevator, whose capacity is 2000 pounds. The elevator comes with ten adult passengers. Suppose your own weight is 150 lbs, and you heard that human weights are normally distributed with the mean of 165 lbs and the standard deviation of 20 lbs. Would you board this elevator or wait for the next one?

Solution. In other words, is overload likely? The probability of an overload equals

$$P\{S_{10} + 150 > 2000\} = P\left\{ \frac{S_{10} - (10)(165)}{20\sqrt{10}} > \frac{2000 - 150 - (10)(165)}{20\sqrt{10}} \right\} = 1 - \Phi(3.16) = 0.0008.$$

So, with probability 0.9992 it is safe to take this elevator. It is now for you to decide.
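The same computation can be sketched in MATLAB. Only the numbers from the example are used; the variable names are illustrative, and the upper Normal tail is evaluated with the base function erfc.

```matlab
% Sketch: overload probability for Example 4.14 (elevator).
capacity = 2000; myWeight = 150;          % pounds
n = 10; mu = 165; sigma = 20;             % ten passengers, Normal(165, 20) weights
z = (capacity - myWeight - n*mu) / (sigma*sqrt(n));   % about 3.16
pOverload = 0.5 * erfc(z / sqrt(2));      % 1 - Phi(z), upper tail of Standard Normal
pSafe = 1 - pOverload;
fprintf('z = %.2f, P{overload} = %.4f, P{safe} = %.4f\n', z, pOverload, pSafe);
```

Because the weights are assumed Normal, the result is exact here rather than an approximation.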

Among the random variables discussed in Chapters 3 and 4, at least three have a form of $S_n$:

Binomial variable = sum of independent Bernoulli variables
Negative Binomial variable = sum of independent Geometric variables
Gamma variable = sum of independent Exponential variables

Hence, the Central Limit Theorem applies to all these distributions with sufficiently large $n$ in the case of Binomial, $k$ for Negative Binomial, and $\alpha$ for Gamma variables.

In fact, Abraham de Moivre (1667–1754) obtained the first version of the Central Limit Theorem as the approximation of Binomial distribution.

Normal approximation to Binomial distribution

Binomial variables represent a special case of $S_n = X_1 + \ldots + X_n$, where all $X_i$ have Bernoulli distribution with some parameter $p$. We know from Section 3.4.5 that a small $p$ allows us to approximate the Binomial distribution with Poisson, and a large $p$ allows such an approximation for the number of failures. For moderate values of $p$ (say, $0.05 \le p \le 0.95$) and for large $n$, we can use Theorem 1:

$$\mathrm{Binomial}(n, p) \approx \mathrm{Normal}\left(\mu = np,\ \sigma = \sqrt{np(1-p)}\right) \qquad (4.19)$$

Matlab Demo. To visualize how Binomial distribution gradually takes the shape of Normal distribution when $n \to \infty$, you may execute the following MATLAB code that graphs Binomial(n, p) pmf for increasing values of n.

for n=1:2:100; x=0:n;
p=gamma(n+1)./gamma(x+1)./gamma(n-x+1).*(0.3).^x.*(0.7).^(n-x);
plot(x,p);
title('Binomial probabilities for increasing values of n');
pause(1); end;
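The visual impression can be supplemented with a number. The sketch below assumes n = 100 and p = 0.3 (the value of p used in the demo) and measures the largest gap between the Binomial pmf and the Normal density with matching µ and σ. The pmf is computed through gammaln rather than gamma, a choice made here to avoid overflow for larger n.

```matlab
% Sketch: largest gap between Binomial(n,p) pmf and the matching Normal density.
n = 100; p = 0.3;                         % assumed values for illustration
x = 0:n;
logpmf = gammaln(n+1) - gammaln(x+1) - gammaln(n-x+1) ...
         + x.*log(p) + (n-x).*log(1-p);   % log of Binomial pmf
pmf = exp(logpmf);
mu = n*p; sigma = sqrt(n*p*(1-p));        % parameters from (4.19)
dens = exp(-(x-mu).^2 ./ (2*sigma^2)) ./ (sigma*sqrt(2*pi));  % Normal density at x
maxDiff = max(abs(pmf - dens));
fprintf('max |pmf - density| = %.5f\n', maxDiff);
```

Rerunning with larger n should show the gap shrinking.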

Continuity correction

This correction is needed when we approximate a discrete distribution (Binomial in this case) by a continuous distribution (Normal). Recall that the probability P{X = x} may be positive if X is discrete, whereas it is always 0 for continuous X. Thus, a direct use of (4.19) will always approximate this probability by 0. It is obviously a poor approximation.

This is resolved by introducing a continuity correction. Expand the interval by 0.5 units in each direction, then use the Normal approximation. Notice that

$$P_X(x) = P\{X = x\} = P\{x - 0.5 < X < x + 0.5\}$$

is true for a Binomial variable X; therefore, the continuity correction does not change the event and preserves its probability. It makes a difference for the Normal distribution, so every time we approximate a discrete distribution with a continuous one, we should use a continuity correction. Now it is the probability of an interval instead of one number, and it is not zero.
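Written out, the corrected approximation of a single Binomial probability takes the following form. This is a worked restatement of the paragraph above for an integer $x$, with $\mu = np$ and $\sigma = \sqrt{np(1-p)}$; the numerical instance ($n = 200$, $p = 0.2$, $x = 40$, so $\sigma = 5.657$) is chosen here only for illustration.

$$P\{X = x\} = P\{x - 0.5 < X < x + 0.5\} \approx \Phi\left(\frac{x + 0.5 - \mu}{\sigma}\right) - \Phi\left(\frac{x - 0.5 - \mu}{\sigma}\right)$$

$$P\{X = 40\} \approx \Phi\left(\frac{0.5}{5.657}\right) - \Phi\left(\frac{-0.5}{5.657}\right) = 2\,\Phi(0.088) - 1 \approx 0.070$$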

Example 4.15. A new computer virus attacks a folder consisting of 200 files. Each file gets damaged with probability 0.2 independently of other files. What is the probability that fewer than 50 files get damaged?

Solution. The number X of damaged files has Binomial distribution with $n = 200$, $p = 0.2$, $\mu = np = 40$, and $\sigma = \sqrt{np(1-p)} = 5.657$. Applying the Central Limit Theorem with the continuity correction,

$$P\{X < 50\} = P\{X < 49.5\} = P\left\{ \frac{X - 40}{5.657} < \frac{49.5 - 40}{5.657} \right\} = \Phi(1.68) = 0.9535.$$

Notice that the properly applied continuity correction replaces 50 with 49.5, not 50.5. Indeed, we are interested in the event that X is strictly less than 50. This includes all values up to 49 and corresponds to the interval [0, 49] that we expand to [0, 49.5]. In other words, events {X < 50} and {X < 49.5} are the same; they include the same possible values of X.

Events {X < 50} and {X < 50.5} are different because the latter includes X = 50, and the former does not. Replacing {X < 50} with {X < 50.5} would have changed its probability and would have given a wrong answer. ♦
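To see what the correction buys, the approximation can be compared with the exact Binomial probability. The sketch below is an illustration rather than part of the original example: it sums the exact pmf over 0, ..., 49 using gammaln, then evaluates the Normal approximation with and without the continuity correction. The helper Phi (Standard Normal cdf via erfc) and all variable names are introduced here.

```matlab
% Sketch: exact P{X < 50} for Binomial(200, 0.2) versus Normal approximations.
n = 200; p = 0.2;
mu = n*p; sigma = sqrt(n*p*(1-p));        % 40 and about 5.657
Phi = @(z) 0.5 * erfc(-z ./ sqrt(2));     % Standard Normal cdf
k = 0:49;                                 % values of X included in {X < 50}
logpmf = gammaln(n+1) - gammaln(k+1) - gammaln(n-k+1) ...
         + k.*log(p) + (n-k).*log(1-p);
pExact  = sum(exp(logpmf));               % exact Binomial probability
pCorr   = Phi((49.5 - mu)/sigma);         % with continuity correction, about 0.9535
pNoCorr = Phi((50 - mu)/sigma);           % without continuity correction
fprintf('exact %.4f, corrected %.4f, uncorrected %.4f\n', pExact, pCorr, pNoCorr);
```

The corrected value should land closer to the exact probability than the uncorrected one.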

When a continuous distribution (say, Gamma) is approximated by another continuous distribution (Normal), the continuity correction is not needed. In fact, it would be an error to use it in this case because it would no longer preserve the probability.

Summary and conclusions

Continuous distributions are used to model various times, sizes, measurements, and all other random variables that assume an entire interval of possible values.

Continuous distributions are described by their densities that play a role analogous to probability mass functions of discrete variables. Computing probabilities essentially reduces to integrating a density over the given set. Expectations and variances are defined similarly to the discrete case, replacing a probability mass function by a density and summation by integration.

In different situations, one uses Uniform, Exponential, Gamma, or Normal distributions. A few other families are studied in later chapters.

The Central Limit Theorem states that a standardized sum of a large number of independent random variables is approximately Normal, thus Table A4 can be used to compute related probabilities. A continuity correction should be used when a discrete distribution is approximated by a continuous distribution.

Characteristics of continuous families are summarized in Section 12.1.2.

Exercises

4.1. The lifetime, in years, of some electronic component is a continuous random variable with the density

$$f(x) = \begin{cases} \dfrac{k}{x^4} & \text{for } x \ge 1 \\ 0 & \text{for } x < 1. \end{cases}$$

Find k, the cumulative distribution function, and the probability for the lifetime to exceed 2 years.

4.2. The time, in minutes, it takes to reboot a certain system is a continuous variable with the density

$$f(x) = \begin{cases} C(10 - x)^2, & \text{if } 0 < x < 10 \\ 0, & \text{otherwise} \end{cases}$$

(a) Compute C.

(b) Compute the probability that it takes between 1 and 2 minutes to reboot.

4.3. The installation time, in hours, for a certain software module has a probability density function $f(x) = k(1 - x^3)$ for $0 < x < 1$. Find k and compute the probability that it takes less than 1/2 hour to install this module.

4.4. Lifetime of a certain hardware is a continuous random variable with density

$$f(x) = \begin{cases} K - x/50 & \text{for } 0 < x < 10 \text{ years} \\ 0 & \text{for all other } x \end{cases}$$

(a) Find K.

(b) What is the probability of a failure within the first 5 years?

(c) What is the expectation of the lifetime?

4.5. Two continuous random variables X and Y have the joint density $f(x, y) = C(x^2 + y)$, $-1 \le x \le 1$, $0 \le y \le 1$.

(a) Compute the constant C.

(b) Find the marginal densities of X and Y . Are these two variables independent?

(c) Compute probabilities P{Y < 0.6} and P {Y < 0.6 | X < 0.5}.

4.6. A program is divided into 3 blocks that are being compiled on 3 parallel computers. Each block takes an Exponential amount of time, 5 minutes on the average, independently of other blocks. The program is completed when all the blocks are compiled. Compute the expected time it takes the program to be compiled.

4.7. The time it takes a printer to print a job is an Exponential random variable with the expectation of 12 seconds. You send a job to the printer at 10:00 am, and it appears to be third in line. What is the probability that your job will be ready before 10:01?

4.8. For some electronic component, the time until failure has Gamma distribution with parameters $\alpha = 2$ and $\lambda = 2$ (years$^{-1}$). Compute the probability that the component fails within the first 6 months.

4.9. On the average, a computer experiences breakdowns every 5 months. The time until the first breakdown and the times between any two consecutive breakdowns are independent Exponential random variables. After the third breakdown, a computer requires a special maintenance.

(a) Compute the probability that a special maintenance is required within the next 9 months.

(b) Given that a special maintenance was not required during the first 12 months, what is the probability that it will not be required within the next 4 months?

4.10. Two computer specialists are completing work orders. The first specialist receives 60% of all orders. Each order takes her an Exponential amount of time with parameter $\lambda_1 = 3$ hrs$^{-1}$. The second specialist receives the remaining 40% of orders. Each order takes him an Exponential amount of time with parameter $\lambda_2 = 2$ hrs$^{-1}$.

A certain order was submitted 30 minutes ago, and it is still not ready. What is the probability that the first specialist is working on it?

4.11. Consider a satellite whose work is based on a certain block A. This block has an independent backup B. The satellite performs its task until both A and B fail. The lifetimes of A and B are exponentially distributed with the mean lifetime of 10 years.

(a) What is the probability that the satellite will work for more than 10 years?

(b) Compute the expected lifetime of the satellite.

4.12. A computer processes tasks in the order they are received. Each task takes an Exponential amount of time with the average of 2 minutes. Compute the probability that a package of 5 tasks is processed in less than 8 minutes.

4.13. On the average, it takes 25 seconds to download a file from the internet. If it takes an Exponential amount of time to download one file, then what is the probability that it will take more than 70 seconds to download 3 independent files?

4.14. The time X it takes to reboot a certain system has Gamma distribution with E(X) = 20 min and Std(X) = 10 min.

(a) Compute parameters of this distribution.

(b) What is the probability that it takes less than 15 minutes to reboot this system?

4.15. A certain system is based on two independent modules, A and B. A failure of any module causes a failure of the whole system. The lifetime of each module has a Gamma distribution, with parameters α and λ given in the table,

Component   α   λ (years$^{-1}$)
A           3   1
B           2   2

(a) What is the probability that the system works at least 2 years without a failure?

(b) Given that the system failed during the first 2 years, what is the probability that it failed due to the failure of component B (but not component A)?

4.16. Let Z be a Standard Normal random variable. Compute

(a) P(Z < 1.25)
(b) P(Z ≤ 1.25)
(c) P(Z > 1.25)
(d) P(|Z| ≤ 1.25)
(e) P(Z < 6.0)
(f) P(Z > 6.0)
(g) With probability 0.8, variable Z does not exceed what value?

4.17. For a Standard Normal random variable Z, compute

(a) P(Z ≥ 0.99)
(b) P(Z ≤ −0.99)
(c) P(Z < 0.99)
(d) P(|Z| > 0.99)
(e) P(Z < 10.0)
(f) P(Z > 10.0)
(g) With probability 0.9, variable Z is less than what?

4.18. For a Normal random variable X with E(X) = −3 and Var(X) = 4, compute

(a) P(X ≤ 2.39)
(b) P(Z ≥ −2.39)
(c) P(|X| ≥ 2.39)
(d) P(|X + 3| ≥ 2.39)
(e) P(X < 5)
(f) P(|X| < 5)
(g) With probability 0.33, variable X exceeds what value?

4.19. According to one of the Western Electric rules for quality control, a produced item is considered conforming if its measurement falls within three standard deviations from the target value. Suppose that the process is in control so that the expected value of each measurement equals the target value. What percent of items will be considered conforming, if the distribution of measurements is

(a) Normal(µ, σ)?

(b) Uniform(a, b)?

4.20. Refer to Exercise 4.19. What percent of items falls beyond 1.5 standard deviations from the mean, if the distribution of measurements is

(a) Normal(µ, σ)?

(b) Uniform(a, b)?

4.21. The average height of professional basketball players is around 6 feet 7 inches, and the standard deviation is 3.89 inches. Assuming Normal distribution of heights within this group,

(a) What percent of professional basketball players are taller than 7 feet?

(b) If your favorite player is within the tallest 20% of all players, what can his height be?

4.22. Refer to the country in Example 4.11 on p. 91, where household incomes follow Normal distribution with µ = 900 coins and σ = 200 coins.

(a) A recent economic reform made households with the income below 640 coins qualify for a free bottle of milk at every breakfast. What portion of the population qualifies for a free bottle of milk?

(b) Moreover, households with an income within the lowest 5% of the population are entitled to a free sandwich. What income qualifies a household to receive free sandwiches?

4.23. The lifetime of a certain electronic component is a random variable with the expectation of 5000 hours and a standard deviation of 100 hours. What is the probability that the average lifetime of 400 components is less than 5012 hours?

4.24. Installation of some software package requires downloading 82 files. On the average, it takes 15 sec to download one file, with a variance of 16 sec$^2$. What is the probability that the software is installed in less than 20 minutes?

4.25. Among all the computer chips produced by a certain factory, 6 percent are defective. A sample of 400 chips is selected for inspection.

(a) What is the probability that this sample contains between 20 and 25 defective chips (including 20 and 25)?

(b) Suppose that each of 40 inspectors collects a sample of 400 chips. What is the probability that at least 8 inspectors will find between 20 and 25 defective chips in their samples?

4.26. An average scanned image occupies 0.6 megabytes of memory with a standard deviation of 0.4 megabytes. If you plan to publish 80 images on your web site, what is the probability that their total size is between 47 megabytes and 50 megabytes?

4.27. A certain computer virus can damage any file with probability 35%, independently of other files. Suppose this virus enters a folder containing 2400 files. Compute the probability that between 800 and 850 files get damaged (including 800 and 850).

4.28. Seventy independent messages are sent from an electronic transmission center. Messages are processed sequentially, one after another. Transmission time of each message is Exponential with parameter $\lambda = 5$ min$^{-1}$. Find the probability that all 70 messages are transmitted in less than 12 minutes. Use the Central Limit Theorem.

4.29. A computer lab has two printers. Printer I handles 40% of all the jobs. Its printing time is Exponential with the mean of 2 minutes. Printer II handles the remaining 60% of jobs. Its printing time is Uniform between 0 minutes and 5 minutes. A job was printed in less than 1 minute. What is the probability that it was printed by Printer I?

4.30. An internet service provider has two connection lines for its customers. Eighty percent of customers are connected through Line I, and twenty percent are connected through Line II.

Line I has a Gamma connection time with parameters $\alpha = 3$ and $\lambda = 2$ min$^{-1}$. Line II has a Uniform(a, b) connection time with parameters a = 20 sec and b = 50 sec. Compute the probability that it takes a randomly selected customer more than 30 seconds to connect to the internet.

4.31. Upgrading a certain software package requires installation of 68 new files. Files are installed consecutively. The installation time is random, but on the average, it takes 15 sec to install one file, with a variance of 11 sec$^2$.

(a) What is the probability that the whole package is upgraded in less than 12 minutes?

(b) A new version of the package is released. It requires only N new files to be installed, and it is promised that 95% of the time upgrading takes less than 10 minutes. Given this information, compute N .

4.32. Two independent customers are scheduled to arrive in the afternoon. Their arrival times are uniformly distributed between 2 pm and 8 pm. Compute

(a) the expected time of the first (earlier) arrival;

(b) the expected time of the last (later) arrival.

4.33. Let X and Y be independent Standard Uniform random variables.

(a) Find the probability of the event {0.5 < (X + Y) < 1.5}.

(b) Find the conditional probability P{0.3 < X < 0.7 | Y > 0.5}.

This problem can be solved analytically as well as geometrically.

4.34. Prove the memoryless property of Geometric distribution. That is, if X has Geometric distribution with parameter p, show that

P{X > x + y | X > y} = P {X > x}

for any integers x, y ≥ 0.

Computer Simulations and Monte Carlo